(54) Method and apparatus for pre-processing and packaging class files



(57) A method and apparatus for pre-processing and packaging class files. Embodiments remove
duplicate information elements from a set of class files to reduce the size of individual class
files and to prevent redundant resolution of the information elements. Memory allocation
requirements are determined in advance for the set of classes as a whole to reduce the complexity
of memory allocation when the set of classes is loaded. The class files are stored in a single
package for efficient storage, transfer and processing as a unit. In an embodiment, a pre-processor
examines each class file in a set of class files to locate duplicate information in the form of
redundant constants contained in a constant pool. The duplicate constant is placed in a separate
shared table, and all occurrences of the constant are removed from the respective constant pools of
the individual class files. During pre-processing, memory allocation requirements are determined
for each class file, and used to determine a total allocation requirement for the set of class
files. The shared table, the memory allocation requirements and the reduced class files are
packaged as a unit in a multi-class file.
Description

BACKGROUND OF THE INVENTION

1. FIELD OF THE INVENTION

[0001] This invention relates to the field of computer software, and, more specifically, to
object-oriented computer applications.

[0002] Portions of the disclosure of this patent document contain material that is subject to
copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone
of the patent document or the patent disclosure as it appears in the Patent and Trademark Office
file or records, but otherwise reserves all copyright rights whatsoever.

2. BACKGROUND ART 

[0003] With advancements in network technology, the use of networks for facilitating the
distribution of media information, such as text, graphics, and audio, has grown dramatically,
particularly in the case of the Internet and the World Wide Web. One area of focus for current
developmental efforts is in the field of web applications and network interactivity. In addition to
passive media content, such as HTML definitions, computer users or "clients" coupled to the network
are able to access or download application content in the form of applets, for example, from
"servers" on the network.

[0004] To accommodate the variety of hardware systems used by clients, applications or applets are
distributed in a platform-independent format such as the Java® class file format. Object-oriented
applications are formed from multiple class files that are accessed from servers and downloaded
individually as needed. Class files contain bytecode instructions. A "virtual machine" process that
executes on a specific hardware platform loads the individual class files and executes the
bytecodes contained within.

[0005] A problem with the class file format and the class loading process is that class files often
contain duplicated data. The storage, transfer and processing of the individual class files is thus
inefficient due to the redundancy of the information. Also, an application may contain many class
files, all of which are loaded and processed in separate transactions. This slows down the
application and degrades memory allocator performance. Further, a client is required to maintain a
physical connection to the server for the duration of the application in order to access class
files on demand.

[0006] These problems can be understood from a review of general object-oriented programming and an
example of a current network application environment.

Object-Oriented Programming 

[0007] Object-oriented programming is a method of creating computer programs by combining certain
fundamental building blocks, and creating relationships among and between the building blocks. The
building blocks in object-oriented programming systems are called "objects." An object is a
programming unit that groups together a data structure (one or more instance variables) and the
operations (methods) that can use or affect that data. Thus, an object consists of data and one or
more operations or procedures that can be performed on that data. The joining of data and
operations into a unitary building block is called "encapsulation."

[0008] An object can be instructed to perform one of its methods when it receives a "message." A
message is a command or instruction sent to the object to execute a certain method. A message
consists of a method selection (e.g., method name) and a plurality of arguments. A message tells
the receiving object what operations to perform.

[0009] One advantage of object-oriented programming is the way in which methods are invoked. When a
message is sent to an object, it is not necessary for the message to instruct the object how to
perform a certain method. It is only necessary to request that the object execute the method. This
greatly simplifies program development.

[0010] Object-oriented programming languages are predominantly based on a "class" scheme. The
class-based object-oriented programming scheme is generally described in Lieberman, "Using
Prototypical Objects to Implement Shared Behavior in Object-Oriented Systems," OOPSLA 86
Proceedings, September 1986, pp. 214-223.

[0011] A class defines a type of object that typically includes both variables and methods for the
class. An object class is used to create a particular instance of an object. An instance of an
object class includes the variables and methods defined for the class. Multiple instances of the
same class can be created from an object class. Each instance that is created from the object class
is said to be of the same type or class.

[0012] To illustrate, an employee object class can include "name" and "salary" instance variables
and a "set_salary" method. Instances of the employee object class can be created, or instantiated,
for each employee in an organization. Each object instance is said to be of type "employee." Each
employee object instance includes "name" and "salary" instance variables and the "set_salary"
method. The values associated with the "name" and "salary" variables in each employee object
instance contain the name and salary of an employee in the organization. A message can be sent to
an employee's employee object instance to invoke the "set_salary" method to modify the employee's
salary (i.e., the value associated with the "salary" variable in the employee's employee object).
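
The employee example can be sketched in Java as follows. This is only an illustration: the class name, the constructor and the accessor methods are assumptions, and "set_salary" is rendered as setSalary to follow Java naming conventions.

```java
// Hypothetical illustration of the "employee" object class described above.
class Employee {
    private final String name;   // "name" instance variable
    private double salary;       // "salary" instance variable

    Employee(String name, double salary) {
        this.name = name;
        this.salary = salary;
    }

    // Corresponds to the "set_salary" method; calling it plays the role of the message.
    void setSalary(double newSalary) {
        this.salary = newSalary;
    }

    String getName() { return name; }
    double getSalary() { return salary; }
}

class EmployeeDemo {
    public static void main(String[] args) {
        // Two instances of the same class, each with its own variable values.
        Employee alice = new Employee("Alice", 50000.0);
        Employee bob = new Employee("Bob", 42000.0);
        alice.setSalary(55000.0);   // affects only alice's "salary" variable
        System.out.println(alice.getName() + " " + alice.getSalary());
        System.out.println(bob.getName() + " " + bob.getSalary());
    }
}
```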

[0013] A hierarchy of classes can be defined such that an object class definition has one or more
subclasses. A subclass inherits its parent's (and grandparent's etc.) definition. Each subclass in
the hierarchy may add to or modify the behavior specified by its parent class. Some object-oriented
programming languages support multiple inheritance where a subclass may inherit a class definition
from more than one parent class. Other programming languages support only single inheritance, where
a subclass is limited to inheriting the class definition of only one parent class. The Java
programming language also provides a mechanism known as an "interface" which comprises a set of
constant and abstract method declarations. An object class can implement the abstract methods
defined in an interface. Both single and multiple inheritance are available to an interface. That
is, an interface can inherit an interface definition from more than one parent interface.
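
As a brief illustration of interface inheritance in Java, the sketch below declares an interface that extends two parent interfaces and a class that implements it. All names are hypothetical and chosen only for this example.

```java
// Hypothetical interfaces; none of these names come from the specification.
interface Named {
    String name();                 // abstract method declaration
}

interface Salaried {
    int MONTHS_PER_YEAR = 12;      // fields declared in an interface are constants
    double monthlySalary();
}

// An interface may inherit from more than one parent interface.
interface Payable extends Named, Salaried {
    double annualSalary();
}

// A class has a single parent class but can implement the inherited abstract methods.
class Contractor implements Payable {
    private final String name;
    private final double monthly;

    Contractor(String name, double monthly) {
        this.name = name;
        this.monthly = monthly;
    }

    public String name() { return name; }
    public double monthlySalary() { return monthly; }
    public double annualSalary() { return monthly * MONTHS_PER_YEAR; }
}
```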

[0014] An object is a generic term that is used in the object-oriented programming environment to
refer to a module that contains related code and variables. A software application can be written
using an object-oriented programming language whereby the program's functionality is implemented
using objects.

[0015] A Java program is composed of a number of classes and interfaces. Unlike many programming
languages, in which a program is compiled into machine-dependent executable program code, Java
classes are compiled into machine independent bytecode class files. Each class contains code and
data in a platform-independent format called the class file format. The computer system acting as
the execution vehicle contains a program called a virtual machine, which is responsible for
executing the code in Java classes. The virtual machine provides a level of abstraction between the
machine independence of the bytecode classes and the machine-dependent instruction set of the
underlying computer hardware. A "class loader" within the virtual machine is responsible for
loading the bytecode class files as needed, and either an interpreter executes the bytecodes
directly, or a "just-in-time" (JIT) compiler transforms the bytecodes into machine code, so that
they can be executed by the processor. Figure 1 is a block diagram illustrating a sample Java
network environment comprising a client platform 102 coupled over a network 101 to a server 100 for
the purpose of accessing Java class files for execution of a Java application or applet.

Sample Java Network Application Environment

[0016] In Figure 1, server 100 comprises Java development environment 104 for use in creating the
Java class files for a given application. The Java development environment 104 provides a
mechanism, such as an editor and an applet viewer, for generating class files and previewing
applets. A set of Java core classes 103 comprise a library of Java classes that can be referenced
by source files containing other/new Java classes. From Java development environment 104, one or
more Java source files 105 are generated. Java source files 105 contain the programmer readable
class definitions, including data structures, method implementations and references to other
classes. Java source files 105 are provided to Java compiler 106, which compiles Java source files
105 into compiled ".class" files 107 that contain bytecodes executable by a Java virtual machine.
Bytecode class files 107 are stored (e.g., in temporary or permanent storage) on server 100, and
are available for download over network 101.

[0017] Client platform 102 contains a Java virtual machine (JVM) 111 which, through the use of
available native operating system (O/S) calls 112, is able to execute bytecode class files and
execute native O/S calls when necessary during execution.

[0018] Java class files are often identified in applet tags within an HTML (hypertext markup
language) document. A web server application 108 is executed on server 100 to respond to HTTP
(hypertext transport protocol) requests containing URLs (universal resource locators) to HTML
documents, also referred to as "web pages." When a browser application executing on client platform
102 requests an HTML document, such as by forwarding URL 109 to web server 108, the browser
automatically initiates the download of the class files 107 identified in the applet tag of the
HTML document. Class files 107 are typically downloaded from the server and loaded into virtual
machine 111 individually as needed.

[0019] It is typical for the classes of a Java program to be loaded as late during the program's
execution as possible; they are loaded on demand from the network (stored on a server), or from a
local file system, when first referenced during the Java program's execution. The virtual machine
locates and loads each class file, parses the class file format, allocates memory for the class's
various components, and links the class with other already loaded classes. This process makes the
code in the class readily executable by the virtual machine.

[0020] The individualized class loading process, as it is typically executed, has disadvantages
with respect to use of storage resources on storage devices, allocation of memory, and execution
speed and continuity. Those disadvantages are magnified by the fact that a typical Java application
can contain hundreds or thousands of small class files. Each class file is self-contained. This
often leads to information redundancy between class files, for example, with two or more class
files sharing common constants. As a result, multiple classes inefficiently utilize large amounts
of storage space on permanent storage devices to separately store duplicate information. Similarly,
loading each class file separately causes unnecessary duplication of information in application
memory as well. Further, because common constants are resolved separately per class during the
execution of Java code, the constant resolution process is unnecessarily repeated.

[0021] Because classes are loaded one by one, each small class requires a separate set of dynamic
memory allocations. This creates memory fragmentation, which wastes memory, and degrades allocator
performance. Also, separate loading "transactions" are required for each class. The virtual machine
searches for a class file either on a network device, or on a local file system, and sets up a
connection to load the class and parse it. This is a relatively slow process, and has to be
repeated for each class. The execution of a Java program is prone to indeterminate pauses in
response/execution caused by each class loading procedure, especially when loading classes over a
network. These pauses create a problem for systems in which interactive or real-time performance is
important.

[0022] A further disadvantage of the individual class loading process is that the computer
executing the Java program must remain physically connected to the source of Java classes during
the duration of the program's execution. This is a problem especially for mobile or embedded
computers without local disk storage or dedicated network access. If the physical connection is
disrupted during execution of a Java application, class files will be inaccessible and the
application will fail when a new class is needed. Also, it is often the case that physical
connections to networks such as the Internet have a cost associated with the duration of such a
connection. Therefore, in addition to the inconvenience associated with maintaining a connection
throughout application execution, there is added cost to the user as a result of the physical
connection.

[0023] A Java archive (JAR) format has been developed to group class files together in a single
transportable package known as a JAR file. JAR files encapsulate Java classes in archived,
compressed format. A JAR file can be identified in an HTML document within an applet tag. When a
browser application reads the HTML document and finds the applet tag, the JAR file is downloaded to
the client computer and decompressed. Thus, a group of class files may be downloaded from a server
to a client in one download transaction. After downloading and decompressing, the archived class
files are available on the client system for individual loading as needed in accordance with
standard class loading procedures. The archived class files remain subject to storage
inefficiencies due to duplicated data between files, as well as memory fragmentation due to the
performance of separate memory allocations for each class file.

SUMMARY OF THE INVENTION 

[0024] A method and apparatus for pre-processing and packaging class files is described.
Embodiments of the invention remove duplicate information elements from a set of class files to
reduce the size of individual class files and to prevent redundant resolution of the information
elements. Memory allocation requirements are determined in advance for the set of classes as a
whole to reduce the complexity of memory allocation when the set of classes is loaded. The class
files are stored in a single package for efficient storage, transfer and processing as a unit.

[0025] In an embodiment of the invention, a pre-processor examines each class file in a set of
class files to locate duplicate information in the form of redundant constants contained in a
constant pool. The duplicate constant is placed in a separate shared table, and all occurrences of
the constant are removed from the respective constant pools of the individual class files. During
pre-processing, memory allocation requirements are determined for each class file, and used to
determine a total allocation requirement for the set of class files. The shared table, the memory
allocation requirements and the reduced class files are packaged as a unit in a multi-class file.

[0026] When a virtual machine wishes to load the classes in the multi-class file, the location of
the multi-class file is determined and the multi-class file is downloaded from a server, if needed.
The memory allocation information in the multi-class file is used by the virtual machine to
allocate memory from the virtual machine's heap for the set of classes. The individual classes,
with respective reduced constant pools, are loaded, along with the shared table, into the virtual
machine. Constant resolution is carried out on demand on the respective reduced constant pools and
the shared table.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0027]

Figure 1 is an embodiment of a Java network application environment.

Figure 2 is a block diagram of an embodiment of a computer system capable of providing a suitable
execution environment for an embodiment of the invention.

Figure 3 is a block diagram of an embodiment of a class file format.

Figure 4 is a flow diagram of a class file pre-processing method in accordance with an embodiment
of the invention.

Figure 5 is a block diagram of a multi-class file format in accordance with an embodiment of the
invention.

Figure 6 is a block diagram of the runtime data areas of a virtual machine in accordance with an
embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION 

[0028] The invention is a method and apparatus for pre-processing and packaging class files. In the
following description, numerous specific details are set forth to provide a more thorough
description of embodiments of the invention. It will be apparent, however, to one skilled in the
art, that the invention may be practiced without these specific details. In other instances, well
known features have not been described in detail so as not to obscure the invention.

Embodiment of Computer Execution Environment (Hardware)

[0029] An embodiment of the invention can be implemented as computer software in the form of
computer readable program code executed on a general purpose computer such as computer 200
illustrated in Figure 2, or in the form of bytecode class files executable by a virtual machine
running on such a computer. A keyboard 210 and mouse 211 are coupled to a bi-directional system bus
218. The keyboard and mouse are for introducing user input to the computer system and communicating
that user input to central processing unit (CPU) 213. Other suitable input devices may be used in
addition to, or in place of, the mouse 211 and keyboard 210. I/O (input/output) unit 219 coupled to
bi-directional system bus 218 represents such I/O elements as a printer, A/V (audio/video) I/O,
etc.

[0030] Computer 200 includes a video memory 214, main memory 215 and mass storage 212, all coupled
to bi-directional system bus 218 along with keyboard 210, mouse 211 and CPU 213. The mass storage
212 may include both fixed and removable media, such as magnetic, optical or magnetic optical
storage systems or any other available mass storage technology. Bus 218 may contain, for example,
thirty-two address lines for addressing video memory 214 or main memory 215. The system bus 218
also includes, for example, a 32-bit data bus for transferring data between and among the
components, such as CPU 213, main memory 215, video memory 214 and mass storage 212. Alternatively,
multiplex data/address lines may be used instead of separate data and address lines.

[0031] In one embodiment of the invention, the CPU 213 is a microprocessor manufactured by
Motorola®, such as the 680X0 processor, or a microprocessor manufactured by Intel®, such as the
80X86 or Pentium® processor, or a SPARC® microprocessor from Sun Microsystems®. However, any other
suitable microprocessor or microcomputer may be utilized. Main memory 215 is comprised of dynamic
random access memory (DRAM). Video memory 214 is a dual-ported video random access memory. One port
of the video memory 214 is coupled to video amplifier 216. The video amplifier 216 is used to drive
the cathode ray tube (CRT) raster monitor 217. Video amplifier 216 is well known in the art and may
be implemented by any suitable apparatus. This circuitry converts pixel data stored in video memory
214 to a raster signal suitable for use by monitor 217. Monitor 217 is a type of monitor suitable
for displaying graphic images.

[0032] Computer 200 may also include a communication interface 220 coupled to bus 218.
Communication interface 220 provides a two-way data communication coupling via a network link 221
to a local network 222. For example, if communication interface 220 is an integrated services
digital network (ISDN) card or a modem, communication interface 220 provides a data communication
connection to the corresponding type of telephone line, which comprises part of network link 221.
If communication interface 220 is a local area network (LAN) card, communication interface 220
provides a data communication connection via network link 221 to a compatible LAN. Wireless links
are also possible. In any such implementation, communication interface 220 sends and receives
electrical, electromagnetic or optical signals which carry digital data streams representing
various types of information.

[0033] Network link 221 typically provides data communication through one or more networks to
other data devices. For example, network link 221 may provide a connection through local network
222 to host computer 223 or to data equipment operated by an Internet Service Provider (ISP) 224.
ISP 224 in turn provides data communication services through the world wide packet data
communication network now commonly referred to as the "Internet" 225. Local network 222 and
Internet 225 both use electrical, electromagnetic or optical signals which carry digital data
streams. The signals through the various networks and the signals on network link 221 and through
communication interface 220, which carry the digital data to and from computer 200, are exemplary
forms of carrier waves transporting the information.

[0034] Computer 200 can send messages and receive data, including program code, through the
network(s), network link 221, and communication interface 220. In the Internet example, server 226
might transmit a requested code for an application program through Internet 225, ISP 224, local
network 222 and communication interface 220. In accord with the invention, one such downloaded
application is the apparatus for pre-processing and packaging class files described herein.

[0035] The received code may be executed by CPU 213 as it is received, and/or stored in mass
storage 212, or other non-volatile storage for later execution. In this manner, computer 200 may
obtain application code in the form of a carrier wave.

[0036] The computer systems described above are for purposes of example only. An embodiment of the invention 
may be implemented in any type of computer system or programming or processing environment. 

Class File Structure

[0037] Embodiments of the invention can be better understood with reference to aspects of the class
file format. Description is provided below of the Java class file format. Also, enclosed as
Appendix A of this specification are Chapter 4, "The class File Format," and Chapter 5, "Constant
Pool Resolution," of The Java Virtual Machine Specification, by Tim Lindholm and Frank Yellin,
published by Addison-Wesley in September 1996, ©Sun Microsystems, Inc.

[0038] The Java class file consists of a stream of 8-bit bytes, with 16-bit, 32-bit and 64-bit
structures constructed from consecutive 8-bit bytes. A single class or interface file structure is
contained in the class file. This class file structure appears as follows:



ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}


where u2 and u4 refer to unsigned two-byte and four-byte quantities. This structure is graphically
illustrated in Figure 3.

[0039] In Figure 3, class file 300 comprises four-byte magic value 301, two-byte minor version
number 302, two-byte major version number 303, two-byte constant pool count value 304, constant
pool table 305 corresponding to the constant pool array of variable length elements, two-byte
access flags value 306, two-byte "this class" identifier 307, two-byte super class identifier 308,
two-byte interfaces count value 309, interfaces table 310 corresponding to the interfaces array of
two-byte elements, two-byte fields count value 311, fields table 312 corresponding to the fields
array of variable length elements, two-byte methods count value 313, methods table 314
corresponding to the methods array of variable length elements, two-byte attributes count value
315, and attributes table 316 corresponding to the attributes array of variable-length elements.
Each of the above structures is briefly described below.

[0040] Magic value 301 contains a number identifying the class file format. For the Java class file
format, the magic number has the value 0xCAFEBABE. The minor version number 302 and major version
number 303 specify the minor and major version numbers of the compiler responsible for producing
the class file.
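
A minimal sketch of reading these first three items in Java is shown below. The class and method names (ClassFileHeader, read, parse) are hypothetical and not part of the described embodiments; the sketch relies only on the fact that java.io.DataInputStream reads multi-byte values in big-endian order, which is the byte order of the class file format.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: reads only the first three ClassFile items.
class ClassFileHeader {
    static final int JAVA_MAGIC = 0xCAFEBABE;

    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Reads u4 magic, u2 minor_version and u2 major_version from the stream.
    static ClassFileHeader read(DataInputStream in) throws IOException {
        int magic = in.readInt();                 // u4 magic
        if (magic != JAVA_MAGIC) {
            throw new IOException("bad magic: 0x" + Integer.toHexString(magic));
        }
        int minor = in.readUnsignedShort();       // u2 minor_version
        int major = in.readUnsignedShort();       // u2 major_version
        return new ClassFileHeader(minor, major);
    }

    // Convenience wrapper for in-memory data; reports I/O failures as unchecked.
    static ClassFileHeader parse(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }
}
```

The remaining items (constant pool, access flags and so on) would be read in the same order as they appear in the ClassFile structure.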

[0041] The constant pool count value 304 identifies the number of entries in constant pool table
305. Constant pool table 305 is a table of variable-length data structures representing various
string constants, numerical constants, class names, field names, and other constants that are
referred to within the ClassFile structure. Each entry in the constant pool table has the following
general structure:
cp_info {
    u1 tag;
    u1 info[];
}



where the one-byte "tag" specifies a particular constant type. The format of the info[] array
differs based on the constant type. The info[] array may be a numerical value such as for integer
and float constants, a string value for a string constant, or an index to another entry of a
different constant type in the constant pool table. Further details on the constant pool table
structure and constant types are available in Chapter 4 of Appendix A.
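
The dependence of info[] on the tag can be sketched as follows. The CpInfo class and its methods are hypothetical; the tag numbers and entry sizes are those of the constant types defined in Chapter 4 of The Java Virtual Machine Specification (first edition), and any other tag is rejected in this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical holder for one raw constant pool entry.
class CpInfo {
    final int tag;       // u1 tag
    final byte[] info;   // u1 info[], whose length depends on the tag

    CpInfo(int tag, byte[] info) {
        this.tag = tag;
        this.info = info;
    }

    // Reads one entry from the stream, choosing the info[] length from the tag.
    static CpInfo read(DataInputStream in) throws IOException {
        int tag = in.readUnsignedByte();
        byte[] info;
        switch (tag) {
            case 1: {                       // Utf8: u2 length, then that many bytes
                int length = in.readUnsignedShort();
                info = new byte[2 + length];
                info[0] = (byte) (length >>> 8);
                info[1] = (byte) length;
                in.readFully(info, 2, length);
                return new CpInfo(tag, info);
            }
            case 3: case 4:                 // Integer, Float: 4-byte value
            case 9: case 10: case 11:       // Fieldref, Methodref, InterfaceMethodref: two u2 indices
            case 12:                        // NameAndType: two u2 indices
                info = new byte[4];
                break;
            case 5: case 6:                 // Long, Double: 8-byte value
                info = new byte[8];
                break;
            case 7: case 8:                 // Class, String: one u2 index into the pool
                info = new byte[2];
                break;
            default:
                throw new IOException("unsupported constant tag " + tag);
        }
        in.readFully(info);
        return new CpInfo(tag, info);
    }

    // Convenience wrapper for in-memory data; reports I/O failures as unchecked.
    static CpInfo parse(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }
}
```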

[0042] Access flags value 306 is a mask of modifiers used with class and interface declarations.
The "this class" value 307 is an index into constant pool table 305 to a constant type structure
representing the class or interface defined by this class file. The super class value 308 is either
zero, indicating the class is a subclass of java.lang.Object, or an index into the constant pool
table to a constant type structure representing the superclass of the class defined by this class
file.

[0043] Interfaces count value 309 identifies the number of direct superinterfaces of this class or
interface, and accordingly, the number of elements in interfaces table 310. Interfaces table 310
contains two-byte indices into constant pool table 305. Each corresponding entry in constant pool
table 305 is a constant type structure representing an interface which is a direct superinterface
of the class or interface defined by this class file.

[0044] The fields count value 311 provides the number of structures in fields table 312. Each entry
in fields table 312 is a variable-length structure providing a description of a field in the class
type. Fields table 312 includes only those fields that are declared by the class or interface
defined by this class file.

[0045] The methods count value 313 indicates the number of structures in methods table 314. Each
element of methods table 314 is a variable-length structure giving a description of, and virtual
machine code for, a method in the class or interface.

[0046] The attributes count value 315 indicates the number of structures in attributes table 316.
Each element in attributes table 316 is a variable-length attribute structure. Attribute structures
are discussed in section 4.7 of Appendix A.

[0047] Embodiments of the invention examine the constant pool table for each class in a set of
classes to determine where duplicate information exists. For example, where two or more classes use
the same string constant, the string constant may be removed from each class file structure and
placed in a shared constant pool table. In the simple case, if N classes have the same constant
entry, N units of memory space are taken up in storage resources. By removing all constant entries
and providing one shared entry, N-1 units of memory space are freed. The memory savings increase
with N. Also, by implementing a shared constant table, entries in the constant table need be fully
resolved at most once. After the initial resolution, future code references to the constant may
directly use the constant.
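
The "resolved at most once" property can be illustrated with a small cache. This is a sketch under stated assumptions rather than the resolution procedure of any particular virtual machine: SharedConstantTable, its resolver callback and the use of strings for unresolved entries are all hypothetical, and the resolver is assumed never to return null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical shared constant table that resolves each entry at most once.
class SharedConstantTable {
    private final List<String> rawEntries;            // unresolved constants, one copy each
    private final Object[] resolved;                  // cache, filled on first use
    private final Function<String, Object> resolver;  // assumed to return non-null values
    private int resolutions = 0;                      // number of calls made to the resolver

    SharedConstantTable(List<String> rawEntries, Function<String, Object> resolver) {
        this.rawEntries = new ArrayList<>(rawEntries);
        this.resolved = new Object[rawEntries.size()];
        this.resolver = resolver;
    }

    // The first call resolves the entry; later calls reuse the cached result.
    Object resolve(int index) {
        if (resolved[index] == null) {
            resolved[index] = resolver.apply(rawEntries.get(index));
            resolutions++;
        }
        return resolved[index];
    }

    int resolutionCount() { return resolutions; }
}
```

With N classes referring to the same shared entry, the resolver in this sketch runs once instead of N times.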

Pre-processing and Packaging Classes

[0048] An embodiment of the invention uses a class pre-processor to package classes in a format
called an "mclass" or multi-class file. A method for pre-processing and packaging a set of class
files is illustrated in the flow diagram of Figure 4.

[0049] The method begins in step 400 with a set of arbitrary class files "S" (typically part of one
application). In step 401, the pre-processor reads and parses each class in "S." In step 402, the
pre-processor examines the constant pool tables of each class to determine the set of class file
constants (such as strings and numerics, as well as others specific to the class file format) that
can be shared between classes in "S." A shared constant pool table is created in step 403, with all
duplicate constants determined from step 402. In step 404, the pre-processor removes the duplicate,
shared constants from the individual constant pool tables of each class.

[0050] In step 405, the pre-processor computes the in-core memory requirements of each class in
"S," as would normally be determined by the class loader for the given virtual machine. This is the
amount of memory the virtual machine would allocate for each class, if it were to load each class
separately. After considering all classes in "S" and the additional memory requirement for the
shared constant pool table, the total memory requirement for loading "S" is computed in step 406.
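
Steps 402 through 406 can be sketched in Java on a deliberately simplified model. Everything named here is an assumption made for illustration: each constant pool is modeled as a list of strings, the per-class memory figures are supplied by the caller rather than computed from parsed class data, and the shared pool is charged a fixed number of bytes per entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the pre-processor: pools are lists of strings.
class PreProcessor {
    final List<String> sharedPool = new ArrayList<>();          // result of step 403
    final List<List<String>> reducedPools = new ArrayList<>();  // result of step 404
    long totalMemory;                                           // result of step 406

    PreProcessor(List<List<String>> pools, long[] perClassMemory, long bytesPerSharedEntry) {
        // Step 402: count how many classes use each constant.
        Map<String, Integer> classesUsing = new HashMap<>();
        for (List<String> pool : pools) {
            for (String constant : new LinkedHashSet<>(pool)) {
                classesUsing.merge(constant, 1, Integer::sum);
            }
        }
        // Step 403: constants used by two or more classes go into the shared pool.
        Set<String> shared = new LinkedHashSet<>();
        for (List<String> pool : pools) {
            for (String constant : pool) {
                if (classesUsing.get(constant) > 1) {
                    shared.add(constant);
                }
            }
        }
        sharedPool.addAll(shared);
        // Step 404: remove the shared constants from each individual pool.
        for (List<String> pool : pools) {
            List<String> reduced = new ArrayList<>(pool);
            reduced.removeAll(shared);
            reducedPools.add(reduced);
        }
        // Steps 405-406: sum the per-class requirements, then add the shared pool.
        for (long memory : perClassMemory) {
            totalMemory += memory;
        }
        totalMemory += bytesPerSharedEntry * sharedPool.size();
    }
}
```

A real pre-processor would compare complete cp_info entries rather than strings and would derive the memory figures from the layout used by the target virtual machine.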

[0051] In step 407, the pre-processor produces a multi-class (mclass) file that contains the shared
constant pool table created in step 403, information about memory allocation requirements
determined in steps 405 and 406, and all classes in "S," with their respective reduced constant
pool tables. The mclass file for the class set "S" is output in step 408. In some embodiments, to
further reduce the size of the multi-class file, the multi-class file may be compressed.

EP 0 913 769 A2

[0052] An example of one embodiment of a multi-class file structure may be represented as follows: 



    MclassFile {
        u2 shared_pool_count;
        cp_info shared_pool[shared_pool_count-1];
        u2 mem_alloc_req;
        u2 classfile_count;
        ClassFile classfiles[classfile_count];
    }

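As an illustration of the layout above, the following sketch writes and reads the leading fields of a multi-class file with java.io.DataOutputStream and java.io.DataInputStream. It rests on several assumptions that the text does not state: the shared pool holds only CONSTANT_Utf8 entries (tag 1), no ClassFile structures follow, and the names MclassHeaderSketch, write and readMemAllocReq are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

final class MclassHeaderSketch {
    static final int CONSTANT_UTF8 = 1; // tag value of CONSTANT_Utf8 in the class file format

    /** Writes shared_pool_count, the shared pool, mem_alloc_req and a zero classfile_count. */
    static byte[] write(List<String> sharedUtf8, int memAllocReq) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeShort(sharedUtf8.size() + 1); // shared_pool_count = entries + 1
            for (String s : sharedUtf8) {
                out.writeByte(CONSTANT_UTF8);      // u1 tag
                out.writeUTF(s);                   // u2 length + modified UTF-8 bytes
            }
            out.writeShort(memAllocReq);           // u2 mem_alloc_req
            out.writeShort(0);                     // u2 classfile_count (no classes in this sketch)
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Skips the shared pool and returns mem_alloc_req. */
    static int readMemAllocReq(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int sharedPoolCount = in.readUnsignedShort();
            for (int i = 1; i < sharedPoolCount; i++) {
                int tag = in.readUnsignedByte();
                if (tag != CONSTANT_UTF8) {
                    throw new IOException("unsupported tag in this sketch: " + tag);
                }
                in.readUTF();
            }
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] file = write(Arrays.asList("java/lang/Object", "<init>"), 4096);
        System.out.println("bytes written: " + file.length);
        System.out.println("mem_alloc_req: " + readMemAllocReq(file));
    }
}
```

Because mem_alloc_req is declared as u2 in the structure, the sketch can only represent requirements up to 65535; how an embodiment encodes larger totals is not specified here.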

[0053] In one embodiment of the invention, a new constant type is defined with a corresponding constant type tag. The new constant type provides as its info[] element an index into the shared constant table. During pre-processing, duplicated constant elements are placed in the shared constant pool as a shared element, and an element of the new constant type replaces the duplicated element in the reduced pool to direct constant resolution to the shared element in the shared constant pool. Reduction occurs because the replacement element is just a pointer to the actual constant placed in the shared constant pool.
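The indirection described in this paragraph can be sketched as below. The text does not give the tag value or layout of the new constant type, so the sketch models a reduced-pool entry abstractly as either a literal or an index into the shared pool; SharedRefSketch, Entry and resolve are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

final class SharedRefSketch {
    /** A reduced-pool entry: either an ordinary constant or a reference into the shared pool. */
    static final class Entry {
        final String literal;  // non-null for an ordinary constant
        final int sharedIndex; // meaningful only when literal is null

        private Entry(String literal, int sharedIndex) {
            this.literal = literal;
            this.sharedIndex = sharedIndex;
        }

        static Entry literal(String value) {
            return new Entry(value, -1);
        }

        static Entry shared(int index) {
            return new Entry(null, index);
        }
    }

    /** Resolves an entry, following the index into the shared pool when needed. */
    static String resolve(Entry entry, List<String> sharedPool) {
        return entry.literal != null ? entry.literal : sharedPool.get(entry.sharedIndex);
    }

    public static void main(String[] args) {
        List<String> sharedPool = Arrays.asList("java/lang/Object", "<init>");
        System.out.println(resolve(Entry.shared(0), sharedPool));
        System.out.println(resolve(Entry.literal("Foo"), sharedPool));
    }
}
```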

[0054] Figure 5 is a simplified block diagram of an embodiment of the multi-class file format. Mclass file 500 comprises shared constant pool table 501, memory allocation requirements 502 and the set of individual classes 503. The set of individual classes 503 comprises the class file structures for classes 1-N (N being the number of classes in the set), along with the corresponding reduced constant pool tables 1-N. The size of the shared constant pool table 501 is dependent on the number of duplicate constants found in the set of classes. The memory allocation requirements 502 may be represented as a single value indicating the total memory needed to load all class structures (classes 1-N) in individual classes 503, as well as the shared constant pool table 501. The shared pool count and classfile count (not shown in Figure 5) identify the number of elements in the shared constant pool table 501 and the classfiles array of ClassFile structures (represented by classes 503), respectively.

[0055] The multi-class file is typically considerably smaller than the sum of the sizes of the individual class files that it was derived from. It can be loaded by the virtual machine during or prior to the execution of an application, instead of having to load each contained class on demand. The virtual machine is also able to take advantage of the allocation requirements information to pre-allocate all required memory for the multi-class set. This solves many of the problems associated with class loading.

[0056] Classes in a multi-class set share information between classes, and therefore are smaller. This provides the following advantages:

a) the classes take up less space on servers or storage devices;

b) the classes take less network or file transfer time to read;

c) the classes take up less memory when loaded; and

d) execution is faster, since shared constants are resolved at most once.

[0057] Multi-class sets consolidate the loading of required classes instead of loading the classes one by one. Using allocation information, only one dynamic memory allocation is needed instead of multiple allocation operations. This results in less fragmentation, less time spent in the allocator, and less waste of memory space.
[0058] Because the class files are consolidated in a single multi-class file, only a single transaction is needed to perform a network or file system search, to set up a transfer session (e.g., HTTP) and to transfer the entire set of classes. This minimizes pauses in the execution that can result from such transactions and provides for deterministic execution, with no pauses for class loading during a program run. Also, once the multi-class file is loaded and parsed, there is no need for the computer executing the program to remain connected to the source of the classes.
[0059] Figure 6 illustrates the runtime data areas of the virtual machine when a multi-class file is processed and loaded in accordance with an embodiment of the invention. In Figure 6, runtime data areas 600 comprise multiple program counter registers (PC REG 1-M) and multiple stacks 1-M. One program counter register and one stack are allocated to each thread executing in the virtual machine. Each program counter register contains the address of the virtual machine instruction for the current method being executed by the respective thread. The stacks are used by the respective threads to store local variables, partial results and an operand stack.

[0060] Runtime data areas 600 further comprise heap 601, which contains method area 602. Heap 601 is the runtime data area from which memory for all class instances and arrays is allocated. Method area 602 is shared among all threads, and stores class structures such as the constant pool, field and method data, and the code for methods. Within method area 602, memory block 603, which may or may not be contiguous, is allocated to the multi-class set of classes "S." Other regions in heap 601 may be allocated to "S" as well. Reduced constant pools 1-N, along with shared constant pool 604, reside within block 603.

[0061] Due to the removal of redundant constants in accordance with an embodiment of the invention, the size of block 603 required to contain reduced constant pools 1-N and shared constant pool 604 is much smaller than would be required to accommodate constant pools 1-N, were they not reduced. Also, the allocations in block 603 are much less fragmented (and may be found in contiguous memory) than the memory that would be allocated were the classes to be loaded one by one.

[0062] Thus, a method and apparatus for pre-processing and packaging class files has been described in conjunction with one or more specific embodiments. The invention is defined by the claims and their full scope of equivalents.
[0063] The features disclosed in the foregoing description, in the claims and/or in the accompanying drawings may, both separately and in any combination thereof, be material for realising the invention in diverse forms thereof.
APPENDIX A: The Java Virtual Machine Specification



CHAPTER 4 

The class File Format 



This chapter describes the Java Virtual Machine class file format. Each class file contains one Java type, either a class or an interface. Compliant Java Virtual Machine implementations must be capable of dealing with all class files that conform to the specification provided by this book.

A class file consists of a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively. Multibyte data items are always stored in big-endian order, where the high bytes come first. In Java, this format is supported by interfaces java.io.DataInput and java.io.DataOutput and classes such as java.io.DataInputStream and java.io.DataOutputStream.

This chapter defines its own set of data types representing Java class file data: The types u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively. In Java, these types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput.
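A short sketch of reading the three types with java.io.DataInputStream follows. Note that readInt returns a signed int, so the sketch masks the result into a long to obtain the unsigned u4 value; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class UnsignedReadSketch {
    /** Reads one u1, one u2 and one u4 (in that order) from big-endian data. */
    static long[] readU1U2U4(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long u1 = in.readUnsignedByte();
            long u2 = in.readUnsignedShort();
            long u4 = in.readInt() & 0xFFFFFFFFL; // readInt is signed; mask to get the unsigned value
            return new long[] {u1, u2, u4};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, 0x01, 0x02, (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        long[] values = readU1U2U4(data);
        System.out.printf("u1=%d u2=%d u4=0x%X%n", values[0], values[1], values[2]);
    }
}
```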

The Java class file format is presented using pseudostructures written in a C-like structure notation. To avoid confusion with the fields of Java Virtual Machine classes and class instances, the contents of the structures describing the Java class file format are referred to as items. Unlike the fields of a C structure, successive items are stored in the Java class file sequentially, without padding or alignment.


Variable-sized tables, consisting of variable-sized items, are used in several class file structures. Although we will use C-like array syntax to refer to table items, the fact that tables are streams of varying-sized structures means that it is not possible to directly translate a table index into a byte offset into the table.

Where we refer to a data structure as an array, it is literally an array.



4.1 ClassFile

A class file contains a single ClassFile structure:

    ClassFile {
        u4 magic;
        u2 minor_version;
        u2 major_version;
        u2 constant_pool_count;
        cp_info constant_pool[constant_pool_count-1];
        u2 access_flags;
        u2 this_class;
        u2 super_class;
        u2 interfaces_count;
        u2 interfaces[interfaces_count];
        u2 fields_count;
        field_info fields[fields_count];
        u2 methods_count;
        method_info methods[methods_count];
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

The items in the ClassFile structure are as follows:
magic

The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.

minor_version, major_version

The values of the minor_version and major_version items are the minor and major version numbers of the compiler that produced this class file. An implementation of the Java Virtual Machine normally supports class files having a given major version number and minor version numbers 0 through some particular minor_version.

If an implementation of the Java Virtual Machine supports some range of minor version numbers and a class file of the same major version but a higher minor version is encountered, the Java Virtual Machine must not attempt to run the newer code. However, unless the major version number differs, it will be feasible to implement a new Java Virtual Machine that can run code of minor versions up to and including that of the newer code.

A Java Virtual Machine must not attempt to run code with a different major version. A change of the major version number indicates a major incompatible change, one that requires a fundamentally different Java Virtual Machine.

In Sun's Java Developer's Kit (JDK) 1.0.2 release, documented by this book, the value of major_version is 45. The value of minor_version is 3. Only Sun may define the meaning of new class file version numbers.

constant_pool_count

The value of the constant_pool_count item must be greater than zero. It gives the number of entries in the constant_pool table of the class file, where the constant_pool entry at index zero is included in the count but is not present in the constant_pool table of the class file. A constant_pool index is considered valid if it is greater than zero and less than constant_pool_count.

constant_pool[]

The constant_pool is a table of variable-length structures representing various string constants, class names, field names, and other constants that are referred to within the ClassFile structure and its substructures.

The first entry of the constant_pool table, constant_pool[0], is reserved for internal use by a Java Virtual Machine implementation. That entry is not present in the class file. The first entry in the class file is constant_pool[1].

Each of the constant_pool table entries at indices 1 through constant_pool_count-1 is a variable-length structure whose format is indicated by its first "tag" byte.
java.lang.Object, the only class or interface without a superclass.

For an interface, the value of super_class must always be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure representing the class java.lang.Object.

interfaces_count 

The value of the interfaces_count item gives the number of direct superinterfaces of this class or interface type.

interfaces[]

Each value in the interfaces array must be a valid index into the constant_pool table. The constant_pool entry at each value of interfaces[i], where 0 ≤ i < interfaces_count, must be a CONSTANT_Class_info (§4.4.1) structure representing an interface which is a direct superinterface of this class or interface type, in the left-to-right order given in the source for the type.

fields_count 

The value of the fields_count item gives the number of field_info structures in the fields table. The field_info (§4.5) structures represent all fields, both class variables and instance variables, declared by this class or interface type.

fields[]

Each value in the fields table must be a variable-length field_info (§4.5) structure giving a complete description of a field in the class or interface type. The fields table includes only those fields that are declared by this class or interface. It does not include items representing fields that are inherited from superclasses or superinterfaces.

methods_count 

The value of the methods_count item gives the number of method_info structures in the methods table.

methods[] 

Each value in the methods table must be a variable-length method_info (§4.6) structure giving a complete description of and Java Virtual Machine code for a method in the class or interface.

The method_info structures represent all methods, both instance methods and, for classes, class (static) methods, declared by this class or interface type. The methods table only includes those methods that are explicitly declared by this class. Interfaces have only the single method <clinit>, the interface initialization method (§3.8). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.

attributes_count

The value of the attributes_count item gives the number of attributes (§4.7) in the attributes table of this class.

attributes[]
access_flags

The value of the access_flags item is a mask of modifiers used with class and interface declarations. The access_flags modifiers are shown in Table 4.1.



Flag Name        Value    Meaning                                                Used By
ACC_PUBLIC       0x0001   Is public; may be accessed from outside its package.   Class, interface
ACC_FINAL        0x0010   Is final; no subclasses allowed.                       Class
ACC_SUPER        0x0020   Treat superclass methods specially in invokespecial.   Class, interface
ACC_INTERFACE    0x0200   Is an interface.                                       Interface
ACC_ABSTRACT     0x0400   Is abstract; may not be instantiated.                  Class, interface

An interface is distinguished by its ACC_INTERFACE flag being set. If ACC_INTERFACE is not set, this class file defines a class, not an interface.

Interfaces may only use flags indicated in Table 4.1 as used by interfaces. Classes may only use flags indicated in Table 4.1 as used by classes. An interface is implicitly abstract (§2.13.1); its ACC_ABSTRACT flag must be set. An interface cannot be final; its implementation could never be completed (§2.13.1) if it were, so it could not have its ACC_FINAL flag set.

The flags ACC_FINAL and ACC_ABSTRACT cannot both be set for a class; the implementation of such a class could never be completed (§2.8.2).

The setting of the ACC_SUPER flag directs the Java Virtual Machine which of two alternative semantics for its invokespecial instruction to express; it exists for backward compatibility for code compiled by Sun's older Java compilers. All new implementations of the Java Virtual Machine should implement the semantics for invokespecial documented in Chapter 6, "Java Virtual Machine Instruction Set." All new compilers to the Java Virtual Machine's instruction set should set the ACC_SUPER flag. Sun's older Java compilers generate ClassFile flags with ACC_SUPER unset. Sun's older Java Virtual Machine implementations ignore the flag if it is set.

All unused bits of the access_flags item, including those not assigned in Table 4.1, are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
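The flag values of Table 4.1 and the two consistency rules stated above can be sketched as follows. The constants come from the table; the class AccessFlagsSketch and its methods decode and isConsistent are hypothetical, and only the rules named in the comments are checked.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsSketch {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    /** Lists the names of the Table 4.1 flags that are set; unknown bits are ignored. */
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0) names.add("ACC_PUBLIC");
        if ((accessFlags & ACC_FINAL) != 0) names.add("ACC_FINAL");
        if ((accessFlags & ACC_SUPER) != 0) names.add("ACC_SUPER");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((accessFlags & ACC_ABSTRACT) != 0) names.add("ACC_ABSTRACT");
        return names;
    }

    /**
     * Checks two rules from the text: an interface must be abstract and not final,
     * and a class must not be both final and abstract.
     */
    static boolean isConsistent(int accessFlags) {
        boolean isInterface = (accessFlags & ACC_INTERFACE) != 0;
        boolean isAbstract = (accessFlags & ACC_ABSTRACT) != 0;
        boolean isFinal = (accessFlags & ACC_FINAL) != 0;
        if (isInterface && (!isAbstract || isFinal)) {
            return false;
        }
        return !(isFinal && isAbstract);
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021) + " consistent=" + isConsistent(0x0021));
        System.out.println(decode(0x0410) + " consistent=" + isConsistent(0x0410));
    }
}
```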

this_class

The value of the this_class item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface defined by this class file.

super_class

For a class, the value of the super_class item either must be zero or must be a valid index into the constant_pool table. If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the superclass of the class defined by this class file. Neither the superclass nor any of its superclasses may be a final class.

If the value of super_class is zero, then this class file must represent the class
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Each value of the accribuces cable must be a variable- kngch attribute structure. A 
class File siAiciurc can have any number of attributes (^4.7^ associated with it. 

The only attribute defined by this specification for the attributes table of a ClassFile structure is the SourceFile attribute (§4.7.2).

A Java Virtual Machine implementation is required to silently ignore any or all attributes in the attributes table of a ClassFile structure that it does not recognize. Attributes not defined in this specification are not allowed to affect the semantics of the class file, but only to provide additional descriptive information (§4.7.1).



4.2 Internal Form of Fully Qualified Class Names 

Class names that appear in class file structures are always represented in a fully qualified form (§2.7.9). These class names are always represented as CONSTANT_Utf8_info (§4.4.7) structures, and they are referenced from those CONSTANT_NameAndType_info structures that have class names as part of their descriptor (§4.3), as well as from all CONSTANT_Class_info (§4.4.1) structures.

For historical reasons the exact syntax of fully qualified class names that appear in class file structures differs from the familiar Java fully qualified class name documented in §2.7.9. In internal form, the ASCII periods ('.') that normally separate the identifiers (§2.2) that make up the fully qualified name are replaced by ASCII forward slashes ('/'). For example, the normal fully qualified name of class Thread is java.lang.Thread. In the form used in descriptors in class files, a reference to the name of class Thread is implemented using a CONSTANT_Utf8_info structure representing the string "java/lang/Thread".
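The conversion between the two name forms amounts to a character replacement, as this small sketch shows. The class and method names are hypothetical, and the sketch does not validate that the input is a legal class name.

```java
final class InternalNameSketch {
    /** Converts a fully qualified name such as java.lang.Thread to internal form. */
    static String toInternalForm(String qualifiedName) {
        return qualifiedName.replace('.', '/');
    }

    /** Converts an internal-form name such as java/lang/Thread back to the familiar form. */
    static String toQualifiedName(String internalName) {
        return internalName.replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(toInternalForm("java.lang.Thread"));
        System.out.println(toQualifiedName("java/lang/Thread"));
    }
}
```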



4.3 Descriptors

A descriptor is a string representing the type of a field or method. 

4.3.1 Grammar Notation 

Descriptors are specified using a grammar. This grammar is a set of productions that describe how sequences of characters can form syntactically correct descriptors of various types. Terminal symbols of the grammar are shown in bold fixed-width font. Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative right-hand sides for the nonterminal then follow on succeeding lines. A nonterminal symbol on the right-hand side of a production that is followed by an asterisk (*) represents zero or more possibly different values produced from that nonterminal, appended without any intervening space.

4.3.2 Field Descriptors 

A field descriptor represents the type of a class or instance variable. It is a series of characters generated by the grammar:

    FieldDescriptor:
        FieldType

    ComponentType:
        FieldType

    FieldType:
        BaseType
        ObjectType
        ArrayType

    BaseType:
        B
        C
        D
        F
        I
        J
        S
        Z

    ObjectType:
        L <classname> ;

    ArrayType:
        [ ComponentType

The characters of BaseType, the L and ; of ObjectType, and the [ of ArrayType are all ASCII characters. The <classname> represents a fully qualified class name, for instance, java.lang.Thread. For historical reasons it is stored in a class file in a modified internal form (§4.2).

The meaning of the field types is as follows:

    B               byte        signed byte
    C               char        character
    D               double      double-precision IEEE 754 float
    F               float       single-precision IEEE 754 float
    I               int         integer
    J               long        long integer
    L<classname>;   ...         an instance of the class
    S               short       signed short
    Z               boolean     true or false
    [               ...         one array dimension

For example, the descriptor of an int instance variable is simply I. The descriptor of an instance variable of type Object is Ljava/lang/Object;. Note that the internal form of the fully qualified class name for class Object is used. The descriptor of an instance variable that is a multidimensional double array,

    double d[][][];

is

    [[[D
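The field descriptor grammar is small enough to decode by hand. The following sketch turns a descriptor back into a Java type name; it is a minimal illustration with hypothetical names (FieldDescriptorSketch, toJavaType) and it rejects anything that does not match the grammar above.

```java
final class FieldDescriptorSketch {
    /** Converts a field descriptor such as [[[D into a readable type such as double[][][]. */
    static String toJavaType(String descriptor) {
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++; // each leading '[' is one array dimension
        }
        String rest = descriptor.substring(dims);
        String base;
        switch (rest) {
            case "B": base = "byte"; break;
            case "C": base = "char"; break;
            case "D": base = "double"; break;
            case "F": base = "float"; break;
            case "I": base = "int"; break;
            case "J": base = "long"; break;
            case "S": base = "short"; break;
            case "Z": base = "boolean"; break;
            default:
                if (rest.length() >= 3 && rest.charAt(0) == 'L' && rest.endsWith(";")) {
                    // ObjectType: strip L and ; and undo the internal name form
                    base = rest.substring(1, rest.length() - 1).replace('/', '.');
                } else {
                    throw new IllegalArgumentException("bad field descriptor: " + descriptor);
                }
        }
        StringBuilder type = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            type.append("[]");
        }
        return type.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("I"));
        System.out.println(toJavaType("Ljava/lang/Object;"));
        System.out.println(toJavaType("[[[D"));
    }
}
```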

4.3.3 Method Descriptors 

A parameter descriptor represents a parameter passed to a method: 
ParameterDescriptor: 
FieldType 

A method descriptor represents the parameters that the method takes and the value that it returns:

    MethodDescriptor:
        ( ParameterDescriptor* ) ReturnDescriptor

A return descriptor represents the return value from a method. It is a series of characters generated by the grammar:

    ReturnDescriptor:
        FieldType
        V

The character V indicates that the method returns no value (its return type is void). Otherwise, the descriptor indicates the type of the return value.

A valid Java method descriptor must represent 255 or fewer words of method parameters, where that limit 
includes the word for this in the case of instance method invocations. The limit is on the number of words of 
method parameters and not on the number of parameters themselves; parameters of type long and double each 
use two words. 

For example, the method descriptor for the method

    Object mymethod(int i, double d, Thread t)

is

    (IDLjava/lang/Thread;)Ljava/lang/Object;

Note that internal forms of the fully qualified class names of Thread and Object are used in the method descriptor.

The method descriptor for mymethod is the same whether mymethod is static or is an instance method. Although an instance method is passed this, a reference to the current class instance, in addition to its intended parameters, that fact is not reflected in the method descriptor. (A reference to this is not passed to a static method.) The reference to this is passed implicitly by the method invocation instructions of the Java Virtual Machine used to invoke instance methods.
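The 255-word limit on method parameters can be checked by walking the parameter part of a descriptor. The sketch below is an illustration under stated assumptions: it counts long and double as two words, everything else (including arrays of long or double, which are references) as one word, and adds one word for this on instance methods. It does not validate base type characters, and the names are hypothetical.

```java
final class ParameterWordsSketch {
    /** Counts the words of parameters described by a method descriptor. */
    static int parameterWords(String descriptor, boolean isStatic) {
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("descriptor must start with '(': " + descriptor);
        }
        int words = isStatic ? 0 : 1; // instance methods also receive 'this'
        int i = 1;
        while (i < descriptor.length() && descriptor.charAt(i) != ')') {
            boolean isArray = false;
            while (descriptor.charAt(i) == '[') {
                isArray = true;
                i++;
                if (i >= descriptor.length()) {
                    throw new IllegalArgumentException("truncated descriptor: " + descriptor);
                }
            }
            char c = descriptor.charAt(i);
            if (c == 'L') {
                int semicolon = descriptor.indexOf(';', i);
                if (semicolon < 0) {
                    throw new IllegalArgumentException("unterminated class name: " + descriptor);
                }
                i = semicolon + 1;
                words += 1;
            } else {
                // A long or double takes two words unless it is an array element type.
                words += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
                i++;
            }
        }
        if (i >= descriptor.length()) {
            throw new IllegalArgumentException("missing ')': " + descriptor);
        }
        return words;
    }

    public static void main(String[] args) {
        String d = "(IDLjava/lang/Thread;)Ljava/lang/Object;";
        System.out.println("static:   " + parameterWords(d, true));
        System.out.println("instance: " + parameterWords(d, false));
    }
}
```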
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4.4 Constant Pool 

All constant_pool table entries have the following general format:

    cp_info {
        u1 tag;
        u1 info[];
    }

Each item in the constant_pool table must begin with a 1-byte tag indicating the kind of cp_info entry. The contents of the info array vary with the value of tag. The valid tags and their values are listed in Table 4.2.



    Constant Type                    Value
    CONSTANT_Class                   7
    CONSTANT_Fieldref                9
    CONSTANT_Methodref               10
    CONSTANT_InterfaceMethodref      11
    CONSTANT_String                  8
    CONSTANT_Integer                 3
    CONSTANT_Float                   4
    CONSTANT_Long                    5
    CONSTANT_Double                  6
    CONSTANT_NameAndType             12
    CONSTANT_Utf8                    1

Each tag byte must be followed by two or more bytes giving information about the specific constant. The format of the additional information varies with the tag value.



4.4.1 CONSTANT_Class 

The CONSTANT_Class_info structure is used to represent a class or an interface:

    CONSTANT_Class_info {
        u1 tag;
        u2 name_index;
    }

The items of the CONSTANT_Class_info structure are the following:
tag

The tag item has the value CONSTANT_Class (7).
name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid fully qualified Java class name (§2.8.1) that has been converted to the class file's internal form (§4.2).

Because arrays are objects, the opcodes anewarray and multianewarray can reference array "classes" via CONSTANT_Class_info (§4.4.1) structures in the constant_pool table. In this case, the name of the class is the descriptor of the array type. For example, the class name representing a two-dimensional int array type

    int[][]

is

    [[I

The class name representing the type array of class Thread

    Thread[]

is

    [Ljava.lang.Thread;

A valid Java array type descriptor must have 255 or fewer array dimensions.

4.4.2 CONSTANT_Fieldref, CONSTANT_Methodref, and CONSTANT_InterfaceMethodref

Fields, methods, and interface methods are represented by similar structures:

    CONSTANT_Fieldref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }

    CONSTANT_Methodref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }

    CONSTANT_InterfaceMethodref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }


The items of these structures are as follows: 
tag 

The tag item of a CONSTANT_Fieldref_info structure has the value CONSTANT_Fieldref (9).

The tag item of a CONSTANT_Methodref_info structure has the value CONSTANT_Methodref (10).

The tag item of a CONSTANT_InterfaceMethodref_info structure has the value CONSTANT_InterfaceMethodref (11).
class_index

The value of the class_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info (§4.4.1) structure representing the class or interface type that contains the declaration of the field or method.

The class_index item of a CONSTANT_Fieldref_info or a CONSTANT_Methodref_info structure must be a class type, not an interface type. The class_index item of a CONSTANT_InterfaceMethodref_info structure must be an interface type that declares the given method.

name_and_type_index

The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info (§4.4.6) structure. This constant_pool entry indicates the name and descriptor of the field or method.

If the name of the method of a CONSTANT_Methodref_info or CONSTANT_InterfaceMethodref_info begins with a '<' ('\u003c'), then the name must be one of the special internal methods (§3.8), either <init> or <clinit>. In this case, the method must return no value.

4.4.3 CONSTANT_String

The CONSTANT_String_info structure is used to represent constant objects of the type java.lang.String:

    CONSTANT_String_info {
        u1 tag;
        u2 string_index;
    }

The items of the CONSTANT_String_info structure are as follows:
tag

The tag item of the CONSTANT_String_info structure has the value CONSTANT_String (8).

string_index

The value of the string_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing the sequence of characters to which the java.lang.String object is to be initialized.

4.4.4 CONSTANT_Integer and CONSTANT_Float

The CONSTANT_Integer_info and CONSTANT_Float_info structures represent four-byte numeric (int and float) constants:

    CONSTANT_Integer_info {
        u1 tag;
        u4 bytes;
    }

    CONSTANT_Float_info {
        u1 tag;
        u4 bytes;
    }



The items of these structures are as follows:
tag

The tag item of the CONSTANT_Integer_info structure has the value CONSTANT_Integer (3).

The tag item of the CONSTANT_Float_info structure has the value CONSTANT_Float (4).
bytes

The bytes item of the CONSTANT_Integer_info structure contains the value of the int constant. The bytes of the value are stored in big-endian (high byte first) order.

The bytes item of the CONSTANT_Float_info structure contains the value of the float constant in IEEE 754 floating-point "single format" bit layout. The bytes of the value are stored in big-endian (high byte first) order, and are first converted into an int argument. Then:

• If the argument is 0x7f800000, the float value will be positive infinity.

• If the argument is 0xff800000, the float value will be negative infinity.

• If the argument is in the range 0x7f800001 through 0x7fffffff or in the range 0xff800001 through 0xffffffff, the float value will be NaN.

• In all other cases, let s, e, and m be three values that might be computed by

    int s = ((bytes >> 31) == 0) ? 1 : -1;
    int e = ((bytes >> 23) & 0xff);
    int m = (e == 0) ?
            (bytes & 0x7fffff) << 1 :
            (bytes & 0x7fffff) | 0x800000;

Then the float value equals the result of the mathematical expression $s \cdot m \cdot 2^{e-150}$.
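The decoding rule for finite values can be cross-checked against java.lang.Float.intBitsToFloat. The sketch assumes the expression is s · m · 2^(e-150), as in the published Java Virtual Machine Specification, and uses Math.scalb so that the power of two is applied exactly; FloatDecodeSketch and decodeFinite are hypothetical names, and the infinity and NaN cases are deliberately not handled.

```java
final class FloatDecodeSketch {
    /** Applies s * m * 2^(e-150) to a bit pattern that is neither infinity nor NaN. */
    static double decodeFinite(int bytes) {
        int s = ((bytes >> 31) == 0) ? 1 : -1;
        int e = (bytes >> 23) & 0xff;
        int m = (e == 0)
                ? (bytes & 0x7fffff) << 1
                : (bytes & 0x7fffff) | 0x800000;
        // Math.scalb multiplies by an exact power of two.
        return Math.scalb((double) (s * m), e - 150);
    }

    public static void main(String[] args) {
        int[] samples = {0x3f800000, 0x40490fdb, 0xc0490fdb, 0x00000001};
        for (int bits : samples) {
            double viaFormula = decodeFinite(bits);
            double viaLibrary = Float.intBitsToFloat(bits);
            System.out.printf("0x%08x formula=%s library=%s%n", bits, viaFormula, viaLibrary);
        }
    }
}
```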



4.4.5 CONSTANT_Long and CONSTANT_Double

The CONSTANT_Long_info and CONSTANT_Double_info represent eight-byte numeric (long and double) constants:

    CONSTANT_Long_info {
        u1 tag;
        u4 high_bytes;
        u4 low_bytes;
    }

    CONSTANT_Double_info {
        u1 tag;
        u4 high_bytes;
        u4 low_bytes;
    }

All eight-byte constants take up two entries in the constant_pool table of the class file, as well as in the in-memory version of the constant pool that is constructed when a class file is read. If a CONSTANT_Long_info or CONSTANT_Double_info structure is the item in the constant_pool table at index n, then the next valid item in the pool is located at index n+2. The constant_pool index n+1 must be considered invalid and must not be used.
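The index rule for eight-byte constants can be illustrated by assigning constant_pool indices to a sequence of entries given only their tags. The tag values 5 and 6 are those of CONSTANT_Long and CONSTANT_Double; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PoolIndexSketch {
    static final int CONSTANT_LONG = 5;
    static final int CONSTANT_DOUBLE = 6;

    /** Returns the constant_pool index of each entry, given tags in file order. */
    static List<Integer> assignIndices(int[] tagsInFileOrder) {
        List<Integer> indices = new ArrayList<>();
        int next = 1; // index 0 is reserved and not present in the file
        for (int tag : tagsInFileOrder) {
            indices.add(next);
            // Long and double entries occupy two slots; the second slot is unusable.
            next += (tag == CONSTANT_LONG || tag == CONSTANT_DOUBLE) ? 2 : 1;
        }
        return indices;
    }

    public static void main(String[] args) {
        int[] tags = {1, 5, 7, 6, 8}; // Utf8, Long, Class, Double, String
        System.out.println(assignIndices(tags));
    }
}
```

For the five entries in main, the matching constant_pool_count would be 8, one more than the last occupied slot.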

The items of these structures are as follows:

tag

The tag item of the CONSTANT_Long_info structure has the value CONSTANT_Long (5).
The tag item of the CONSTANT_Double_info structure has the value CONSTANT_Double (6).

high_bytes, low_bytes

The unsigned high_bytes and low_bytes items of the CONSTANT_Long structure together contain the value of the long constant ((long)high_bytes << 32) + low_bytes, where the bytes of each of high_bytes and low_bytes are stored in big-endian (high byte first) order.

The high_bytes and low_bytes items of the CONSTANT_Double_info structure contain the double value in IEEE 754 floating-point "double format" bit layout. The bytes of each item are stored in big-endian (high byte first) order. The high_bytes and low_bytes items are first converted into a long argument. Then:

• If the argument is 0x7ff0000000000000L, the double value will be positive infinity.

• If the argument is 0xfff0000000000000L, the double value will be negative infinity.

• If the argument is in the range 0x7ff0000000000001L through 0x7fffffffffffffffL or in the range 0xfff0000000000001L through 0xffffffffffffffffL, the double value will be NaN.

• In all other cases, let s, e, and m be three values that might be computed from the argument:

    int s = ((bits >> 63) == 0) ? 1 : -1;
    int e = (int)((bits >> 52) & 0x7ffL);
    long m = (e == 0) ?
            (bits & 0xfffffffffffffL) << 1 :
            (bits & 0xfffffffffffffL) | 0x10000000000000L;

Then the floating-point value equals the double value of the mathematical expression $s \cdot m \cdot 2^{e-1075}$.



4.4.6 CONSTANT_NameAndType

The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:

    CONSTANT_NameAndType_info {
        u1 tag;
        u2 name_index;
        u2 descriptor_index;
    }

The items of the CONSTANT_NameAndType_info structure are as follows:
tag

The tag item of the CONSTANT_NameAndType_info structure has the value CONSTANT_NameAndType (12).

name_index

The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java field name or method name (§2.7) stored as a simple (not fully qualified) name (§2.7.1), that is, as a Java identifier.

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid Java field descriptor (§4.3.2) or method descriptor (§4.3.3).

4.4.7 CONSTANT_Utf8

The CONSTANT_Utf8_info structure is used to represent constant string values.

UTF-8 strings are encoded so that character sequences that contain only non-null ASCII characters can be
represented using only one byte per character, but characters of up to 16 bits can be represented. All characters
in the range '\u0001' to '\u007f' are represented by a single byte:

    0 | bits 0-6

The seven bits of data in the byte give the value of the character represented. The null character ('\u0000') and
characters in the range '\u0080' to '\u07ff' are represented by a pair of bytes x and y:

    x: 1 1 0 | bits 6-10        y: 1 0 | bits 0-5

The bytes represent the character with the value ((x & 0x1f) << 6) + (y & 0x3f).

Characters in the range '\u0800' to '\uffff' are represented by three bytes x, y, and z:

    x: 1 1 1 0 | bits 12-15     y: 1 0 | bits 6-11     z: 1 0 | bits 0-5

The character with the value ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f) is represented by the bytes
x, y, and z.

The bytes of multibyte characters are stored in the class file in big-endian (high byte first) order.

There are two differences between this format and the "standard" UTF-8 format. First, the null byte (byte)0
is encoded using the two-byte format rather than the one-byte format, so that Java Virtual Machine UTF-8
strings never have embedded nulls. Second, only the one-byte, two-byte, and three-byte formats are used. The
Java Virtual Machine does not recognize the longer UTF-8 formats.
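
The three encodings and the two differences from standard UTF-8 can be illustrated with a small decoder. This is a sketch only: the class name ModifiedUtf8Sketch and its methods are hypothetical, and the error handling (rejecting a zero byte and any lead byte of a longer format) is one possible reading of the rules above rather than behavior mandated by the text.

```java
final class ModifiedUtf8Sketch {
    /** Decodes the one-, two-, and three-byte forms described in the text. */
    static String decode(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < bytes.length) {
            int x = bytes[i] & 0xff;
            if (x == 0) {
                // The null character must use the two-byte form, so a zero byte is invalid.
                throw new IllegalArgumentException("zero byte at offset " + i);
            } else if ((x & 0x80) == 0) {            // 0xxxxxxx
                out.append((char) x);
                i += 1;
            } else if ((x & 0xe0) == 0xc0) {         // 110xxxxx 10xxxxxx
                int y = continuation(bytes, i + 1);
                out.append((char) (((x & 0x1f) << 6) + (y & 0x3f)));
                i += 2;
            } else if ((x & 0xf0) == 0xe0) {         // 1110xxxx 10xxxxxx 10xxxxxx
                int y = continuation(bytes, i + 1);
                int z = continuation(bytes, i + 2);
                out.append((char) (((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f)));
                i += 3;
            } else {
                // Longer UTF-8 formats and stray continuation bytes are not recognized.
                throw new IllegalArgumentException("unsupported lead byte at offset " + i);
            }
        }
        return out.toString();
    }

    /** Returns the byte at index as an unsigned value, checking the 10xxxxxx prefix. */
    private static int continuation(byte[] bytes, int index) {
        if (index >= bytes.length || (bytes[index] & 0xc0) != 0x80) {
            throw new IllegalArgumentException("bad continuation byte at offset " + index);
        }
        return bytes[index] & 0xff;
    }
}
```

With this sketch, the byte pair 0xc0 0x80 decodes to the null character, which is how the format avoids embedded zero bytes.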

For more information regarding the UTF-8 format, see File System Safe UCS Transformation Format
(FSS_UTF), X/Open Preliminary Specification, X/Open Company Ltd., Document Number P316. This
information also appears in ISO/IEC 10646, Annex P.

The CONSTANT_Utf8_info structure is

CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}

The items of the CONSTANT_Utf8_info structure are the following:

tag

The tag item of the CONSTANT_Utf8_info structure has the value CONSTANT_Utf8 (1).

length

The value of the length item gives the number of bytes in the bytes array (not the length of
the resulting string). The strings in the CONSTANT_Utf8_info structure are not
null-terminated.

bytes[]

The bytes array contains the bytes of the string. No byte may have the value (byte)0 or
(byte)0xf0-(byte)0xff.
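
As an illustration of the layout above, the following sketch reads one CONSTANT_Utf8_info entry from a byte array. The class and method names are hypothetical. It relies on the standard java.io.DataInputStream.readUTF method, which reads a two-byte length followed by that many bytes of modified UTF-8, so it happens to match the length and bytes items once the tag has been consumed; whether readUTF rejects every byte value forbidden above is not assumed here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class Utf8InfoReaderSketch {
    static final int CONSTANT_UTF8 = 1;

    /** Reads tag, length, and bytes[length] from the start of entry. */
    static String readUtf8Info(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();          // u1 tag
            if (tag != CONSTANT_UTF8) {
                throw new IllegalArgumentException("expected tag 1, found " + tag);
            }
            return in.readUTF();                      // u2 length, then bytes[length]
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the returned string may be shorter than the length item, because length counts bytes rather than characters.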



4.5 Fields

Each field is described by a variable-length field_info structure. The format of this structure is

field_info {
    u2 access_flags;
    u2 name_index;
    u2 descriptor_index;
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The items of the field_info structure are as follows:

access_flags

The value of the access_flags item is a mask of modifiers used to describe access
permission to and properties of a field. The access_flags modifiers are shown in Table 4.3.

Flag Name       Value    Meaning                                                              Used By
ACC_PUBLIC      0x0001   Is public; may be accessed from outside its package.                 Any field
ACC_PRIVATE     0x0002   Is private; usable only within the defining class.                   Class field
ACC_PROTECTED   0x0004   Is protected; may be accessed within subclasses.                     Class field
ACC_STATIC      0x0008   Is static.                                                           Any field
ACC_FINAL       0x0010   Is final; no further overriding or assignment after initialization.  Any field
ACC_VOLATILE    0x0040   Is volatile; cannot be cached.                                       Class field
ACC_TRANSIENT   0x0080   Is transient; not written or read by a persistent object manager.    Class field


Fields of interfaces may only use flags indicated in Table 4.3 as used by any field. Fields of
classes may use any of the flags in Table 4.3.

All unused bits of the access_flags item, including those not assigned in Table 4.3, are
reserved for future use. They should be set to zero in generated class files and should be
ignored by Java Virtual Machine implementations.

Class fields may have at most one of flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE
set (§2.7.8). A class field may not have both ACC_FINAL and ACC_VOLATILE set (§2.9.1).

Each interface field is implicitly static and final (§2.13.4) and must have both its
ACC_STATIC and ACC_FINAL flags set. Each interface field is implicitly public (§2.13.4) and
must have its ACC_PUBLIC flag set.
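
The flag rules for fields can be expressed as simple bit tests. The sketch below is illustrative only: the class FieldFlagsSketch and its method names are hypothetical, and it ignores bits not assigned in Table 4.3, following the statement that such bits should be ignored.

```java
final class FieldFlagsSketch {
    static final int ACC_PUBLIC = 0x0001, ACC_PRIVATE = 0x0002, ACC_PROTECTED = 0x0004,
            ACC_STATIC = 0x0008, ACC_FINAL = 0x0010, ACC_VOLATILE = 0x0040,
            ACC_TRANSIENT = 0x0080;

    // Every bit that Table 4.3 assigns a meaning to.
    static final int ASSIGNED = ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED | ACC_STATIC
            | ACC_FINAL | ACC_VOLATILE | ACC_TRANSIENT;

    /** At most one access flag, and never final together with volatile. */
    static boolean isValidClassField(int flags) {
        int access = flags & (ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED);
        boolean finalAndVolatile =
                (flags & ACC_FINAL) != 0 && (flags & ACC_VOLATILE) != 0;
        return Integer.bitCount(access) <= 1 && !finalAndVolatile;
    }

    /** Interface fields must be exactly public, static, and final. */
    static boolean isValidInterfaceField(int flags) {
        return (flags & ASSIGNED) == (ACC_PUBLIC | ACC_STATIC | ACC_FINAL);
    }
}
```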

name_index

The value of the name_index item must be a valid index into the constant_pool table. The
constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure which
must represent a valid Java field name (§2.7) stored as a simple (not fully qualified) name
(§2.7.1), that is, as a Java identifier.

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8 (§4.4.7) structure
which must represent a valid Java field descriptor (§4.3.2).

attributes_count

The value of the attributes_count item indicates the number of additional attributes (§4.7)
of this field.

attributes[]

Each value of the attributes table must be a variable-length attribute structure. A field can
have any number of attributes (§4.7) associated with it.

The only attribute defined for the attributes table of a field_info structure by this
specification is the ConstantValue attribute (§4.7.3).

A Java Virtual Machine implementation must recognize ConstantValue attributes in the
attributes table of a field_info structure. A Java Virtual Machine implementation is
required to silently ignore any or all other attributes in the attributes table that it does not
recognize. Attributes not defined in this specification are not allowed to affect the semantics of
the class file, but only to provide additional descriptive information (§4.7.1).

4.6 Methods

Each method, and each instance initialization method <init>, is described by a variable-length method_info
structure. The structure has the following format:

method_info {
    u2 access_flags;
    u2 name_index;
    u2 descriptor_index;
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The items of the method_info structure are as follows:

access_flags

The value of the access_flags item is a mask of modifiers used to describe access
permission to and properties of a method or instance initialization method (§3.8). The
access_flags modifiers are shown in Table 4.4.

Flag Name          Value    Meaning                                                  Used By
ACC_PUBLIC         0x0001   Is public; may be accessed from outside its package.     Any method
ACC_PRIVATE        0x0002   Is private; usable only within the defining class.       Class/instance method
ACC_PROTECTED      0x0004   Is protected; may be accessed within subclasses.         Class/instance method
ACC_STATIC         0x0008   Is static.                                               Class/instance method
ACC_FINAL          0x0010   Is final; no overriding is allowed.                      Class/instance method
ACC_SYNCHRONIZED   0x0020   Is synchronized; wrap use in monitor lock.               Class/instance method
ACC_NATIVE         0x0100   Is native; implemented in a language other than Java.    Class/instance method
ACC_ABSTRACT       0x0400   Is abstract; no implementation is provided.              Any method


Methods in interfaces may only use flags indicated in Table 4.4 as used by any method. Class
and instance methods (§2.10.3) may use any of the flags in Table 4.4. Instance initialization
methods (§3.8) may only use ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE.

All unused bits of the access_flags item, including those not assigned in Table 4.4, are
reserved for future use. They should be set to zero in generated class files and should be
ignored by Java Virtual Machine implementations.

At most one of the flags ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE may be set for any
method. Class and instance methods may not use ACC_ABSTRACT together with ACC_FINAL,
ACC_NATIVE, or ACC_SYNCHRONIZED (that is, native and synchronized methods require an
implementation). A class or instance method may not use ACC_PRIVATE with ACC_ABSTRACT
(that is, a private method cannot be overridden, so such a method could never be
implemented or used). A class or instance method may not use ACC_STATIC with
ACC_ABSTRACT (that is, a static method is implicitly final and thus cannot be overridden,
so such a method could never be implemented or used).

Class and interface initialization methods (§3.8), that is, methods named <clinit>, are called
implicitly by the Java Virtual Machine; the value of their access_flags item is ignored.

Each interface method is implicitly abstract, and so must have its ACC_ABSTRACT flag set.
Each interface method is implicitly public (§2.13.5), and so must have its ACC_PUBLIC flag
set.
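
The combinations ruled out for methods can likewise be checked with bit masks. This is a sketch under assumptions: the class MethodFlagsSketch and its methods are hypothetical, unassigned bits are ignored, and the special cases for <init> and <clinit> are deliberately left out.

```java
final class MethodFlagsSketch {
    static final int ACC_PUBLIC = 0x0001, ACC_PRIVATE = 0x0002, ACC_PROTECTED = 0x0004,
            ACC_STATIC = 0x0008, ACC_FINAL = 0x0010, ACC_SYNCHRONIZED = 0x0020,
            ACC_NATIVE = 0x0100, ACC_ABSTRACT = 0x0400;

    // Every bit that Table 4.4 assigns a meaning to.
    static final int ASSIGNED = ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED | ACC_STATIC
            | ACC_FINAL | ACC_SYNCHRONIZED | ACC_NATIVE | ACC_ABSTRACT;

    /** Checks the rules stated for class and instance methods. */
    static boolean isValidClassMethod(int flags) {
        int access = flags & (ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED);
        if (Integer.bitCount(access) > 1) {
            return false;
        }
        if ((flags & ACC_ABSTRACT) != 0) {
            // abstract excludes final, native, synchronized, private, and static
            int excluded = ACC_FINAL | ACC_NATIVE | ACC_SYNCHRONIZED
                    | ACC_PRIVATE | ACC_STATIC;
            return (flags & excluded) == 0;
        }
        return true;
    }

    /** Interface methods must be exactly public and abstract. */
    static boolean isValidInterfaceMethod(int flags) {
        return (flags & ASSIGNED) == (ACC_PUBLIC | ACC_ABSTRACT);
    }
}
```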

name_index

The value of the name_index item must be a valid index into the constant_pool table. The
constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure
representing either one of the special internal method names (§3.8), either <init> or
<clinit>, or a valid Java method name (§2.7), stored as a simple (not fully qualified) name
(§2.7.1).

descriptor_index

The value of the descriptor_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing a valid Java method descriptor (§4.3.3).

attributes_count

The value of the attributes_count item indicates the number of additional attributes (§4.7)
of this method.

attributes[]

Each value of the attributes table must be a variable-length attribute structure. A method
can have any number of optional attributes (§4.7) associated with it.

The only attributes defined by this specification for the attributes table of a method_info
structure are the Code (§4.7.4) and Exceptions (§4.7.5) attributes.

A Java Virtual Machine implementation must recognize Code (§4.7.4) and Exceptions
(§4.7.5) attributes. A Java Virtual Machine implementation is required to silently ignore any
or all other attributes in the attributes table of a method_info structure that it does not
recognize. Attributes not defined in this specification are not allowed to affect the semantics of
the class file, but only to provide additional descriptive information (§4.7.1).



4.7 Attributes

Attributes are used in the ClassFile (§4.1), field_info (§4.5), method_info (§4.6), and Code_attribute
(§4.7.4) structures of the class file format. All attributes have the following general format:

attribute_info {
    u2 attribute_name_index;
    u4 attribute_length;
    u1 info[attribute_length];
}

For all attributes, the attribute_name_index must be a valid unsigned 16-bit index into the constant pool of
the class. The constant_pool entry at attribute_name_index must be a CONSTANT_Utf8 (§4.4.7) string
representing the name of the attribute. The value of the attribute_length item indicates the length of the
subsequent information in bytes. The length does not include the initial six bytes that contain the
attribute_name_index and attribute_length items.
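
Because every attribute starts with the same six-byte header, a reader can step over attributes it does not understand. The following sketch walks an attributes table held in a byte array and collects each attribute_name_index while skipping the info bytes. The class and method names are hypothetical, and the input is assumed to begin at the first attribute of the table.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class AttributeSkipSketch {
    /** Returns the attribute_name_index of each attribute, skipping info[]. */
    static List<Integer> attributeNameIndices(byte[] data, int attributesCount) {
        ByteBuffer buf = ByteBuffer.wrap(data);           // big-endian by default
        List<Integer> names = new ArrayList<>();
        for (int i = 0; i < attributesCount; i++) {
            int nameIndex = buf.getShort() & 0xffff;      // u2 attribute_name_index
            long length = buf.getInt() & 0xffffffffL;     // u4 attribute_length
            if (length > buf.remaining()) {
                throw new IllegalArgumentException("truncated attribute " + i);
            }
            // The length excludes the six header bytes already consumed.
            buf.position(buf.position() + (int) length);
            names.add(nameIndex);
        }
        return names;
    }
}
```

A real reader would look up each name index in the constant pool and dispatch on the resulting string; only the skipping logic is shown here.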

Certain attributes are predefined as part of the class file specification. The predefined attributes are the
SourceFile (§4.7.2), ConstantValue (§4.7.3), Code (§4.7.4), Exceptions (§4.7.5), LineNumberTable
(§4.7.6), and LocalVariableTable (§4.7.7) attributes. Within the context of their use in this specification,
that is, in the attributes tables of the class file structures in which they appear, the names of these
predefined attributes are reserved.

Of the predefined attributes, the Code, ConstantValue, and Exceptions attributes must be recognized and
correctly read by a class file reader for correct interpretation of the class file by a Java Virtual Machine. Use
of the remaining predefined attributes is optional; a class file reader may use the information they contain, and
otherwise must silently ignore those attributes.

4.7.1 Defining and Naming New Attributes

Compilers for Java source code are permitted to define and emit class files containing new attributes in the
attributes tables of class file structures. Java Virtual Machine implementations are permitted to recognize
and use new attributes found in the attributes tables of class file structures. However, all attributes not
defined as part of this Java Virtual Machine specification must not affect the semantics of class or interface
types. Java Virtual Machine implementations are required to silently ignore attributes they do not recognize.

For instance, defining a new attribute to support vendor-specific debugging is permitted. Because Java Virtual
Machine implementations are required to ignore attributes they do not recognize, class files intended for that
particular Java Virtual Machine implementation will be usable by other implementations even if those
implementations cannot make use of the additional debugging information that the class files contain.

Java Virtual Machine implementations are specifically prohibited from throwing an exception or otherwise
refusing to use class files simply because of the presence of some new attribute. Of course, tools operating
on class files may not run correctly if given class files that do not contain all the attributes they require.

Two attributes that are intended to be distinct, but that happen to use the same attribute name and are of the
same length, will conflict on implementations that recognize either attribute. Attributes defined other than by
Sun must have names chosen according to the package naming convention defined by The Java Language
Specification. For instance, a new attribute defined by Netscape might have the name
"COM.Netscape.new-attribute".

Sun may define additional attributes in future versions of this class file specification.

4.7.2 SourceFile Attribute

The SourceFile attribute is an optional fixed-length attribute in the attributes table of the ClassFile
(§4.1) structure. There can be no more than one SourceFile attribute in the attributes table of a given
ClassFile structure.

The SourceFile attribute has the format

SourceFile_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 sourcefile_index;
}

The items of the SourceFile_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing the string "SourceFile".

attribute_length

The value of the attribute_length item of a SourceFile_attribute structure must be 2.

sourcefile_index

The value of the sourcefile_index item must be a valid index into the constant_pool
table. The constant pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing the string giving the name of the source file from which this class file
was compiled.

Only the name of the source file is given by the SourceFile attribute. It never represents the
name of a directory containing the file or an absolute path name for the file. For instance, the
SourceFile attribute might contain the file name foo.java but not the UNIX pathname
/home/lindholm/foo.java.

4.7.3 ConstantValue Attribute

The ConstantValue attribute is a fixed-length attribute used in the attributes table of the field_info
(§4.5) structures. A ConstantValue attribute represents the value of a constant field that must be (explicitly or
implicitly) static; that is, the ACC_STATIC bit (Table 4.3) in the flags item of the field_info structure
must be set. The field is not required to be final. There can be no more than one ConstantValue attribute in
the attributes table of a given field_info structure. The constant field represented by the field_info
structure is assigned the value referenced by its ConstantValue attribute as part of its initialization (§2.16.4).

Every Java Virtual Machine implementation must recognize ConstantValue attributes.

The ConstantValue attribute has the format

ConstantValue_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 constantvalue_index;
}

The items of the ConstantValue_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing the string "ConstantValue".

attribute_length

The value of the attribute_length item of a ConstantValue_attribute structure must be
2.

constantvalue_index

The value of the constantvalue_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must give the constant value represented by this
attribute.

The constant_pool entry must be of a type appropriate to the field, as shown by Table 4.5.



Field Type                        Entry Type
long                              CONSTANT_Long
float                             CONSTANT_Float
double                            CONSTANT_Double
int, short, char, byte, boolean   CONSTANT_Integer
java.lang.String                  CONSTANT_String
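
Table 4.5 amounts to a lookup from the field's type to the kind of constant pool entry its ConstantValue attribute must reference. The sketch below expresses that lookup in terms of field descriptor strings and constant pool tag values. Both are assumptions drawn from the class file format's descriptor and constant pool sections rather than from the table itself, and the class and method names are hypothetical.

```java
final class ConstantValueTypeSketch {
    // Constant pool tag values, assumed from the constant pool section of the format.
    static final int CONSTANT_INTEGER = 3, CONSTANT_FLOAT = 4, CONSTANT_LONG = 5,
            CONSTANT_DOUBLE = 6, CONSTANT_STRING = 8;

    /** Returns the tag required by Table 4.5, or -1 if the field type is not listed. */
    static int expectedTag(String fieldDescriptor) {
        switch (fieldDescriptor) {
            case "J": return CONSTANT_LONG;       // long
            case "F": return CONSTANT_FLOAT;      // float
            case "D": return CONSTANT_DOUBLE;     // double
            case "I":                             // int
            case "S":                             // short
            case "C":                             // char
            case "B":                             // byte
            case "Z": return CONSTANT_INTEGER;    // boolean
            case "Ljava/lang/String;": return CONSTANT_STRING;
            default: return -1;
        }
    }
}
```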
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4.7.4 Code Attribute 



The Code attribute is a variable-length attribute used in the attributes table of method_info structures. A
Code attribute contains the Java Virtual Machine instructions and auxiliary information for a single Java
method, instance initialization method (§3.8), or class or interface initialization method (§3.8). Every Java
Virtual Machine implementation must recognize Code attributes. There must be exactly one Code attribute in
each method_info structure.

The Code attribute has the format

Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The items of the Code_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing the string "Code".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the
initial six bytes.

max_stack

The value of the max_stack item gives the maximum number of words on the operand stack
at any point during execution of this method.

max_locals

The value of the max_locals item gives the number of local variables used by this method,
including the parameters passed to the method on invocation. The index of the first local
variable is 0. The greatest local variable index for a one-word value is max_locals-1. The
greatest local variable index for a two-word value is max_locals-2.

code_length

The value of the code_length item gives the number of bytes in the code array for this
method. The value of code_length must be greater than zero; the code array must not be
empty.

code[]

The code array gives the actual bytes of Java Virtual Machine code that implement the
method.

When the code array is read into memory on a byte addressable machine, if the first byte of
the array is aligned on a 4-byte boundary, the tableswitch and lookupswitch 32-bit offsets
will be 4-byte aligned; refer to the descriptions of those instructions for more information on
the consequences of code array alignment.

The detailed constraints on the contents of the code array are extensive and are given in a
separate section (§4.8).

exception_table_length

The value of the exception_table_length item gives the number of entries in the
exception_table table.

exception_table[]

Each entry in the exception_table array describes one exception handler in the code array.
Each exception_table entry contains the following items:

start_pc, end_pc

The values of the two items start_pc and end_pc indicate the ranges in the code array at
which the exception handler is active. The value of start_pc must be a valid index into the
code array of the opcode of an instruction. The value of end_pc either must be a valid index
into the code array of the opcode of an instruction, or must be equal to code_length, the
length of the code array. The value of start_pc must be less than the value of end_pc.

The start_pc is inclusive and end_pc is exclusive; that is, the exception handler must be
active while the program counter is within the interval [start_pc, end_pc).

handler_pc

The value of the handler_pc item indicates the start of the exception handler. The value of
the item must be a valid index into the code array, must be the index of the opcode of an
instruction, and must be less than the value of the code_length item.

catch_type

If the value of the catch_type item is nonzero, it must be a valid index into the
constant_pool table. The constant_pool entry at that index must be a
CONSTANT_Class_info (§4.4.1) structure representing a class of exceptions that this
exception handler is designated to catch. This class must be the class Throwable or one of its
subclasses. The exception handler will be called only if the thrown exception is an instance of
the given class or one of its subclasses.

If the value of the catch_type item is zero, this exception handler is called for all exceptions.
This is used to implement finally (see Section 7.13, "Compiling finally").
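
The half-open interval and the catch_type rule can be illustrated with a short handler search. This is a sketch under assumptions: the class ExceptionTableSketch and its members are hypothetical, the instance-of test against the constant pool class is abstracted into a caller-supplied predicate, and searching the entries in table order is assumed rather than stated in the text above.

```java
import java.util.List;
import java.util.function.IntPredicate;

final class ExceptionTableSketch {
    /** One exception_table entry. */
    static final class Entry {
        final int startPc, endPc, handlerPc, catchType;

        Entry(int startPc, int endPc, int handlerPc, int catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    /**
     * Returns handler_pc of the first entry that is active at pc and catches the
     * thrown exception, or -1 if none applies. thrownIsInstanceOf receives a
     * constant pool index and reports whether the thrown exception is an
     * instance of the class at that index.
     */
    static int findHandler(List<Entry> table, int pc, IntPredicate thrownIsInstanceOf) {
        for (Entry e : table) {
            boolean active = e.startPc <= pc && pc < e.endPc;   // [start_pc, end_pc)
            boolean catches = e.catchType == 0 || thrownIsInstanceOf.test(e.catchType);
            if (active && catches) {
                return e.handlerPc;
            }
        }
        return -1;
    }
}
```

An entry with catch_type zero matches regardless of the predicate, which mirrors its use for finally.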

attributes_count

The value of the attributes_count item indicates the number of attributes of the Code
attribute.

attributes[]

Each value of the attributes table must be a variable-length attribute structure. A Code
attribute can have any number of optional attributes associated with it.

Currently, the LineNumberTable (§4.7.6) and LocalVariableTable (§4.7.7) attributes,
both of which contain debugging information, are defined and used with the Code attribute.

A Java Virtual Machine implementation is permitted to silently ignore any or all attributes in
the attributes table of a Code attribute. Attributes not defined in this specification are not
allowed to affect the semantics of the class file, but only to provide additional descriptive
information (§4.7.1).

4.7.5 Exceptions Attribute

The Exceptions attribute is a variable-length attribute used in the attributes table of a method_info (§4.6)
structure. The Exceptions attribute indicates which checked exceptions a method may throw. There must be
exactly one Exceptions attribute in each method_info structure.

The Exceptions attribute has the format

Exceptions_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 number_of_exceptions;
    u2 exception_index_table[number_of_exceptions];
}

The items of the Exceptions_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be the CONSTANT_Utf8_info (§4.4.7)
structure representing the string "Exceptions".

attribute_length

The value of the attribute_length item indicates the attribute length, excluding the initial
six bytes.

number_of_exceptions

The value of the number_of_exceptions item indicates the number of entries in the
exception_index_table.

exception_index_table[]

Each nonzero value in the exception_index_table array must be a valid index into the
constant_pool table. For each table item, if exception_index_table[i] != 0, where
0 ≤ i < number_of_exceptions, then the constant_pool entry at index
exception_index_table[i] must be a CONSTANT_Class_info (§4.4.1) structure
representing a class type that this method is declared to throw.

A method should only throw an exception if at least one of the following three criteria is met:

• The exception is an instance of RuntimeException or one of its subclasses.

• The exception is an instance of Error or one of its subclasses.

• The exception is an instance of one of the exception classes specified in the exception_index_table
above, or one of their subclasses.

The above requirements are not currently enforced by the Java Virtual Machine; they are only enforced at
compile time. Future versions of the Java language may require more rigorous checking of throws clauses
when classes are verified.

4.7.6 LineNumberTable Attribute

The LineNumberTable attribute is an optional variable-length attribute in the attributes table of a Code
(§4.7.4) attribute. It may be used by debuggers to determine which part of the Java Virtual Machine code array
corresponds to a given line number in the original Java source file. If LineNumberTable attributes are present
in the attributes table of a given Code attribute, then they may appear in any order. Furthermore, multiple
LineNumberTable attributes may together represent a given line of a Java source file; that is,
LineNumberTable attributes need not be one-to-one with source lines.

The LineNumberTable attribute has the format

LineNumberTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 line_number_table_length;
    {   u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}

The items of the LineNumberTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing the string "LineNumberTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the
initial six bytes.

line_number_table_length

The value of the line_number_table_length item indicates the number of entries in the
line_number_table array.

line_number_table[]

Each entry in the line_number_table array indicates that the line number in the original Java
source file changes at a given point in the code array. Each entry must contain the following
items:

start_pc

The value of the start_pc item must indicate the index into the code array at which the code
for a new line in the original Java source file begins. The value of start_pc must be less than
the value of the code_length item of the Code attribute of which this LineNumberTable is
an attribute.

line_number

The value of the line_number item must give the corresponding line number in the original
Java source file.
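
A debugger-style lookup from a code array index to a source line might be sketched as follows. The class and method names are hypothetical, each entry is represented as a two-element array {start_pc, line_number}, and no ordering of the entries is assumed, so the sketch scans the whole table for the entry with the greatest start_pc not exceeding the given index.

```java
final class LineNumberLookupSketch {
    /**
     * Returns the source line for pc, or -1 if no entry begins at or before pc.
     * Each table row is {start_pc, line_number}.
     */
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int line = -1;
        for (int[] entry : table) {
            // Keep the entry whose start_pc is closest to pc without passing it.
            if (entry[0] <= pc && entry[0] > bestStart) {
                bestStart = entry[0];
                line = entry[1];
            }
        }
        return line;
    }
}
```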

4.7.7 LocalVariableTable Attribute

The LocalVariableTable attribute is an optional variable-length attribute of a Code (§4.7.4) attribute. It may
be used by debuggers to determine the value of a given local variable during the execution of a method. If
LocalVariableTable attributes are present in the attributes table of a given Code attribute, then they may
appear in any order. There may be no more than one LocalVariableTable attribute per local variable in the
Code attribute.

The LocalVariableTable attribute has the format

LocalVariableTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;
        u2 length;
        u2 name_index;
        u2 descriptor_index;
        u2 index;
    } local_variable_table[local_variable_table_length];
}

The items of the LocalVariableTable_attribute structure are as follows:

attribute_name_index

The value of the attribute_name_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7)
structure representing the string "LocalVariableTable".

attribute_length

The value of the attribute_length item indicates the length of the attribute, excluding the
initial six bytes.

local_variable_table_length

The value of the local_variable_table_length item indicates the number of entries in the
local_variable_table array.

local_variable_table[]

Each entry in the local_variable_table array indicates a range of code array offsets
within which a local variable has a value. It also indicates the index into the local variables of
the current frame at which that local variable can be found. Each entry must contain the
following items:

start_pc, length

The given local variable must have a value at indices into the code array in the interval
[start_pc, start_pc+length], that is, between start_pc and start_pc+length
inclusive. The value of start_pc must be a valid index into the code array of this Code
attribute of the opcode of an instruction. The value of start_pc+length must be either a
valid index into the code array of this Code attribute of the opcode of an instruction, or the
first index beyond the end of that code array.

name_index, descriptor_index

The value of the name_index item must be a valid index into the constant_pool table. The
constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7) structure
representing a valid Java local variable name stored as a simple name (§2.7.1).

The value of the descriptor_index item must be a valid index into the constant_pool
table. The constant_pool entry at that index must contain a CONSTANT_Utf8_info (§4.4.7)
structure representing a valid descriptor for a Java local variable. Java local variable
descriptors have the same form as field descriptors (§4.3.2).

index

The given local variable must be at index in its method's local variables. If the local variable
at index is a two-word type (double or long), it occupies both index and index+1.

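To illustrate how the entries of a local_variable_table might be consulted, the sketch below finds the name of the variable occupying a given local variable slot at a given code array index. It is a simplified illustration: the class and its members are hypothetical, the name and descriptor are assumed to have been resolved from the constant pool already, and the descriptors "J" and "D" (long and double) are taken from the field descriptor grammar rather than from the text above.

```java
import java.util.List;

final class LocalVariableLookupSketch {
    /** One local_variable_table entry with its constant pool strings resolved. */
    static final class Entry {
        final int startPc, length, index;
        final String name, descriptor;

        Entry(int startPc, int length, String name, String descriptor, int index) {
            this.startPc = startPc;
            this.length = length;
            this.name = name;
            this.descriptor = descriptor;
            this.index = index;
        }
    }

    /** A two-word variable (long or double) occupies index and index+1. */
    static boolean coversSlot(Entry e, int slot) {
        boolean twoWord = e.descriptor.equals("J") || e.descriptor.equals("D");
        return slot == e.index || (twoWord && slot == e.index + 1);
    }

    /** Returns the variable name live in slot at pc, or null if none is recorded. */
    static String nameAt(List<Entry> table, int pc, int slot) {
        for (Entry e : table) {
            // The interval [start_pc, start_pc+length] is inclusive at both ends.
            boolean inRange = e.startPc <= pc && pc <= e.startPc + e.length;
            if (inRange && coversSlot(e, slot)) {
                return e.name;
            }
        }
        return null;
    }
}
```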


4.8 Constraints on Java Virtual Machine Code 

The Java Virtual Machine code for a method, instance initialization method or class or interface 
initialization method fS3.8) is stored in the code anray of the code attribute of a inechod_in£o structure of a 
class file. This section describes the constraints associated with the contents of the Code.accribuce 
strucmre. 

4,8.1 Static Constraints 

The stone consrramts on a class file arc those defining the well-formedness of the file. With the exception of 
the static constraints on the Java Virtual Machine code of the class file, these constraints have been given in 
the previous section. The static constraints on the Java Virtual Machine code in a class filaspecify how Java 
Virtuai Machine instructions must be laid out in the code array, and what the operands of individual 
instructions must be. 
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The static constraints on the instructions in the code array arc as follows: 

• The code array must not be empty, so the code.ieiigth attribute cannot have the vahie 0. 

• The opcode of the fust instruction in the code array begins at index 0. 

• Only instances of the instructions documented in (§6.4) may appear in the code array. Instances of
instructions using the reserved opcodes (§6.2), the _quick opcodes documented in Chapter 9, "An
Optimization," or any opcodes not documented in this specification may not appear in the code array.

• For each instruction in the code array except the last, the index of the opcode of the next instruction
equals the index of the opcode of the current instruction plus the length of that instruction, including
all its operands. The wide instruction is treated like any other instruction for these purposes; the
opcode specifying the operation that a wide instruction is to modify is treated as one of the operands
of that wide instruction. That opcode must never be directly reachable by the computation.

• The last byte of the last instruction in the code array must be the byte at index code_length-1.
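The layout constraints above can be illustrated with a small scan over the code array. This is only a sketch under stated assumptions: the class name CodeArrayLayout, the method instructionStarts, and the deliberately tiny LENGTHS table are hypothetical, and the table covers just a few fixed-length opcodes. A real verifier needs the full instruction table from the specification and special handling for tableswitch, lookupswitch, and wide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CodeArrayLayout {
    // Hypothetical, incomplete table: opcode -> total instruction length in bytes.
    static final Map<Integer, Integer> LENGTHS = new HashMap<>();
    static {
        LENGTHS.put(0x00, 1); // nop
        LENGTHS.put(0x03, 1); // iconst_0
        LENGTHS.put(0x10, 2); // bipush <byte>
        LENGTHS.put(0x11, 3); // sipush <byte1> <byte2>
        LENGTHS.put(0x15, 2); // iload <index>
        LENGTHS.put(0x60, 1); // iadd
        LENGTHS.put(0xa7, 3); // goto <branchbyte1> <branchbyte2>
        LENGTHS.put(0xac, 1); // ireturn
        LENGTHS.put(0xb1, 1); // return
    }

    /** Returns the index of each opcode, or throws if the layout is malformed. */
    static List<Integer> instructionStarts(byte[] code) {
        if (code.length == 0) {
            throw new IllegalArgumentException("code array must not be empty");
        }
        List<Integer> starts = new ArrayList<>();
        int pc = 0; // the first opcode is at index 0
        while (pc < code.length) {
            int opcode = code[pc] & 0xff;
            Integer length = LENGTHS.get(opcode);
            if (length == null) {
                throw new IllegalArgumentException("opcode not in table at index " + pc);
            }
            // The last byte of the last instruction must be at code_length-1.
            if (pc + length > code.length) {
                throw new IllegalArgumentException("instruction truncated at index " + pc);
            }
            starts.add(pc);
            pc += length; // next opcode = current opcode index + instruction length
        }
        return starts;
    }
}
```

The returned list of start indices is also what a later check needs in order to confirm that every branch target is the opcode of an instruction rather than one of its operand bytes.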



The static constraints on the operands of instructions in the code array are as follows: 

• The target of each jump and branch instruction (jsr, jsr_w, goto, goto_w, ifeq, ifne, iflt, ifge,
ifgt, ifle, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmpge, if_icmpgt, if_icmple,
if_acmpeq, if_acmpne) must be the opcode of an instruction within this method. The target of a jump
or branch instruction must never be the opcode used to specify the operation to be modified by a wide
instruction; a jump or branch target may be the wide instruction itself.

• Each target, including the default, of each tableswitch instruction must be the opcode of an instruction
within this method. Each tableswitch instruction must have a number of entries in its jump table that is
consistent with its low and high jump table operands, and its low value must be less than or equal to
its high value. No target of a tableswitch instruction may be the opcode used to specify the operation
to be modified by a wide instruction; a tableswitch target may be a wide instruction itself.

• Each target, including the default, of each lookupswitch instruction must be the opcode of an
instruction within this method. Each lookupswitch instruction must have a number of match-offset
pairs that is consistent with its npairs operand. The match-offset pairs must be sorted in increasing
numerical order by signed match value. No target of a lookupswitch instruction may be the opcode
used to specify the operation to be modified by a wide instruction; a lookupswitch target may be a
wide instruction itself.

• The operand of each ldc and ldc_w instruction must be a valid index into the constant_pool table.
The constant pool entry referenced by that index must be of type CONSTANT_Integer,
CONSTANT_Float, or CONSTANT_String.

• The operand of each ldc2_w instruction must be a valid index into the constant_pool table. The
constant pool entry referenced by that index must be of type CONSTANT_Long or CONSTANT_Double.
In addition, the subsequent constant pool index must also be a valid index into the constant pool, and
the constant pool entry at that index must not be used.

• The operand of each getfield, putfield, getstatic, and putstatic instruction must be a valid index into
the constant_pool table. The constant pool entry referenced by that index must be of type
CONSTANT_Fieldref.

• The index operand of each invokevirtual, invokespecial, and invokestatic instruction must be a valid
index into the constant_pool table. The constant pool entry referenced by that index must be of type
CONSTANT_Methodref.

• Only the invokespecial instruction is allowed to invoke the method <init>, the instance initialization
method (§3.8). No other method whose name begins with the character '<' ('\u003c') may be called
by the method invocation instructions. In particular, the class initialization method <clinit> is never
called explicitly from Java Virtual Machine instructions, but only implicitly by the Java Virtual
Machine itself.

• The index operand of each invokeinterface instruction must be a valid index into the constant_pool
table. The constant pool entry referenced by that index must be of type
CONSTANT_InterfaceMethodref. The value of the nargs operand of each invokeinterface instruction
must be the same as the number of argument words implied by the descriptor of the
CONSTANT_NameAndType_info structure referenced by the CONSTANT_InterfaceMethodref
constant pool entry. The fourth operand byte of each invokeinterface instruction must have the value
zero.

• The index operand of each instanceof, checkcast, new, anewarray, and multianewarray instruction
must be a valid index into the constant_pool table. The constant pool entry referenced by that index
must be of type CONSTANT_Class.

• No anewarray instruction may be used to create an array of more than 255 dimensions. 

• No new instruction may reference a CONSTANT_Class constant_pool table entry representing an
array class. The new instruction cannot be used to create an array. The new instruction also cannot
be used to create an interface or an instance of an abstract class, but those checks are performed at
link time.

• A multianewarray instruction must only be used to create an array of a type that has at least as many
dimensions as the value of its dimensions operand. That is, while a multianewarray instruction is not
required to create all of the dimensions of the array type referenced by its CONSTANT_Class operand, it
must not attempt to create more dimensions than are in the array type. The dimensions operand of
each multianewarray instruction must not be zero.

• The atype operand of each newarray instruction must take one of the values T_BOOLEAN (4), T_CHAR
(5), T_FLOAT (6), T_DOUBLE (7), T_BYTE (8), T_SHORT (9), T_INT (10), or T_LONG (11).

• The index operand of each iload, fload, aload, istore, fstore, astore, wide, iinc, and ret instruction
must be a natural number no greater than max_locals-1.

• The implicit index of each iload_<n>, fload_<n>, aload_<n>, istore_<n>, fstore_<n>, and
astore_<n> instruction must be no greater than the value of max_locals-1.

• The index operand of each lload, dload, lstore, and dstore instruction must be no greater than the
value of max_locals-2.

• The implicit index of each lload_<n>, dload_<n>, lstore_<n>, and dstore_<n> instruction must be
no greater than the value of max_locals-2.
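Several of the operand constraints above reduce to simple arithmetic checks. The following sketch shows three of them; the class and method names are hypothetical, and treating "increasing order" as strictly increasing (so duplicate match values are rejected) is an assumption rather than something the text states.

```java
class OperandChecks {
    /** Number of jump table entries implied by a tableswitch's low and high operands. */
    static long tableswitchEntries(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must be less than or equal to high");
        }
        // Computed in long arithmetic so extreme int operands cannot overflow.
        return (long) high - (long) low + 1;
    }

    /** True if lookupswitch match values are in increasing signed order (assumed strict). */
    static boolean matchesSorted(int[] matches) {
        for (int i = 1; i < matches.length; i++) {
            if (matches[i - 1] >= matches[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * One-word loads and stores may use indices up to max_locals-1;
     * two-word (long/double) loads and stores only up to max_locals-2.
     */
    static boolean localIndexValid(int index, int maxLocals, boolean twoWord) {
        int limit = twoWord ? maxLocals - 2 : maxLocals - 1;
        return index >= 0 && index <= limit;
    }
}
```

For example, with max_locals equal to 4, index 3 is acceptable for iload but not for lload, because the long would also occupy the nonexistent index 4.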

4.8.2 Structural Constraints

The structural constraints on the code array specify constraints on relationships between Java Virtual Machine 
instructions. The structural constraints are as follows:

• Each instruction must only be executed with the appropriate type and number of arguments in the 
operand stack and local variables, regardless of the execution path that leads to its invocation. An 
instruction operating on values of type int is also permitted to operate on values of type byte, char, 
and short. (As noted in §3.11.1, the Java Virtual Machine internally converts values of types byte,
char, and short to type int.)

• Where an instruction can be executed along several different execution paths, the operand stack must
have the same size prior to the execution of the instruction, regardless of the path taken.

• At no point during execution can the order of the words of a two-word type (long or double) be 
reversed or split up. At no point can the words of a two-word type be operated on individually. 

• No local variable (or local variable pair, in the case of a two-word type) can be accessed before it is
assigned a value. 

• At no point during execution can the operand stack grow to contain more than max_stack words.

• At no point during execution can more words be popped from the operand stack than it contains. 

• Each invokespecial instruction must name only an instance initialization method <init>, a method in
this, a private method, or a method in a superclass of this.

• When the instance initialization method <init> is invoked, an uninitialized class instance must be in
an appropriate position on the operand stack. The <init> method must never be invoked on an
initialized class instance.

• When any instance method is invoked, or when any instance variable is accessed, the class instance 
that contains the instance method or instance variable must already be initialized. 

• There must never be an uninitialized class instance on the operand stack or in a local variable when any 
backwards branch is taken. There must never be an uninitialized class instance in a local variable in 
code protected by an exception handler or a finally clause. However, an uninitialized class instance 
may be on the operand stack in code protected by an exception handler or a finally clause. When an 
exception is thrown, the contents of the operand stack are discarded.

• Each instance initialization method (§3.8), except for the instance initialization method derived from
the constructor of class Object, must call either another instance initialization method of this or an
instance initialization method of its immediate superclass super before its instance members are
accessed. However, this is not necessary in the case of class Object, which does not have a
superclass (§2.4.6).

• The arguments to each method invocation must be method invocation compatible (§2.6.7) with the
method descriptor (§4.3.3).

• An abstract method must never be invoked.

• Each return instruction must match its method's return type. If the method returns a byte, char,
short, or int, only the ireturn instruction may be used. If the method returns a float, long, or
double, only an freturn, lreturn, or dreturn instruction, respectively, may be used. If the method
returns a reference type, it must do so using an areturn instruction, and the returned value must be
assignment compatible (§2.6.6) with the return descriptor (§4.3.3) of the method. All instance
initialization methods, static initializers, and methods declared to return void must only use the return
instruction.

• If getfield or putfield is used to access a protected field of a superclass, then the type of the class
instance being accessed must be the same as or a subclass of the current class. If invokevirtual is used
to access a protected method of a superclass, then the type of the class instance being accessed must
be the same as or a subclass of the current class.

• The type of every class instance loaded from or stored into by a getfield or putfield instruction must
be an instance of the class type or a subclass of the class type.

• The type of every value stored by a putfield or putstatic instruction must be compatible with the
descriptor of the field (§4.3.2) of the class instance or class being stored into. If the descriptor type is
byte, char, short, or int, then the value must be an int. If the descriptor type is float, long, or
double, then the value must be a float, long, or double, respectively. If the descriptor type is a
reference type, then the value must be of a type that is assignment compatible (§2.6.6) with the
descriptor type.

• The type of every value stored into an array of type reference by an aastore instruction must be
assignment compatible (§2.6.6) with the component type of the array.

• Each athrow instruction must only throw values that are instances of class Throwable or of
subclasses of Throwable.

• Execution never falls off the bottom of the code array. 

• No return address (a value of type returnAddress) may be loaded from a local variable. 

• The instruction following each jsr or jsr_w instruction only may be returned to by a single ret
instruction.

• No jsr or jsr_w instruction may be used to recursively call a subroutine if that subroutine is already
present in the subroutine call chain. (Subroutines can be nested when using try-finally constructs
from within a finally clause. For more information on Java Virtual Machine subroutines, see

• Each instance of type returnAddress can be returned to at most once. If a ret instruction returns to a
point in the subroutine call chain above the ret instruction corresponding to a given instance of type
returnAddress, then that instance can never be used as a return address.



4.9 Verification of class Files 

Even though Sun's Java compiler attempts to produce only class files that satisfy all the static constraints in the
previous sections, the Java Virtual Machine has no guarantee that any file it is asked to load was generated by
that compiler, or is properly formed. Applications such as Sun's HotJava World Wide Web browser do not
download source code which they then compile; these applications download already-compiled class files.
The HotJava browser needs to determine whether the class file was produced by a trustworthy Java compiler
or by an adversary attempting to exploit the interpreter.

An additional problem with compile-time checking is version skew. A user may have successfully compiled a
class, say PurchaseStockOptions, to be a subclass of TradingClass. But the definition of TradingClass
might have changed in a way that is not compatible with preexisting binaries since the time the class was
compiled. Methods might have been deleted, or had their return types or modifiers changed. Fields might have
changed types or changed from instance variables to class variables. The access modifiers of a method or
variable may have changed from public to private. For a discussion of these issues, see Chapter 13,
"Binary Compatibility," in The Java Language Specification.

Because of these potential problems, the Java Virtual Machine needs to verify for itself that the desired 
constraints hold on the class files it attempts to incorporate. A well-written Java Virtual Machine emulator
could reject poorly formed instructions when a class file is loaded. Other constraints could be checked at run 
time. For example, a Java Virtual Machine implementation could tag runtime data and have each instruction 
check that its operands are of the right type. 

Instead, Sun's Java Virtual Machine implementation verifies that each class file it considers untrustworthy 
satisfies the necessary constraints at linking time (§2.16.3). Structural constraints on the Java Virtual Machine
code are checked using a simple theorem prover. 

Linking-time verification enhances the performance of the interpreter. Expensive checks that would otherwise 
have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The 
Java Virtual Machine can assume that these checks have already been performed. For example, the Java Virtual
Machine will already know the following:

• There are no operand stack overflows or underflows. 

• All local variable uses and stores are valid.

• The arguments to all the Java Virtual Machine instructions are of valid types. 

Sun's class file verifier is independent of any Java compiler. It should certify all code generated by Sun's 
current Java compiler; it should also certify code that other compilers can generate, as well as code that the 
current compiler could not possibly generate. Any class file that satisfies the structural criteria and static 
constraints will be certified by the verifier.

The class file verifier is also independent of the Java language. Other languages can be compiled into the 
class format, but will only pass verification if they satisfy the same constraints as a class file compiled from 
Java source. 

4.9.1 The Verification Process

The class file verifier operates in four passes:

Pass 1: When a prospective class file is loaded (§2.16.2) by the Java Virtual Machine, the Java Virtual
Machine first ensures that the file has the basic format of a Java class file. The first four bytes must contain 
the right magic number. All recognized attributes must be of the proper length. The class file must not be 
truncated or have extra bytes at the end. The constant pool must not contain any superficially unrecognizable 
information. 

While class file verification properly occurs during class linking (§2.16.3), this check for basic class file
integrity is necessary for any interpretation of the class file contents and can be considered to be logically part
of the verification process. 
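As an illustration of the first Pass 1 check, the sketch below tests whether a byte array begins with the class file magic number. The passage only says "the right magic number"; the value 0xCAFEBABE used here comes from the class file format itself. The names MagicCheck and hasClassFileMagic are hypothetical, and the other Pass 1 checks (attribute lengths, truncation, constant pool sanity) are not modeled.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class MagicCheck {
    // Magic number that begins every class file.
    static final int MAGIC = 0xCAFEBABE;

    /** True if the first four bytes, read big-endian, equal the class file magic number. */
    static boolean hasClassFileMagic(byte[] bytes) {
        if (bytes.length < 4) {
            return false; // too short to hold the magic number
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt() == MAGIC; // readInt is big-endian, like the class file format
        } catch (IOException e) {
            return false; // not expected for an in-memory stream
        }
    }
}
```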

Pass 2: When the class file is linked, the verifier performs all additional verification that can be done 
without looking at the code array of the Code attribute (§4.7.4). The checks performed by this pass include the
following: 

• Ensuring that final classes are not subclassed, and that final methods are not overridden.

• Checking that every class (except Object) has a superclass.

• Ensuring that the constant pool satisfies the documented static constraints; for example, class
references in the constant pool must contain a field that points to a CONSTANT_Utf8 string reference in
the constant pool.

• Checking that all field references and method references in the constant pool have valid names, valid
classes, and a valid type descriptor.

Note that when it looks at field and method references, this pass does not check to make sure that the given 
field or method actually exists in the given class; nor does it check that the type descriptors given refer to real 
classes. It only checks that these items are well formed. More detailed checking is delayed until passes 3 and 4. 

Pass 3: Still during linking, the verifier checks the code array of the Code attribute for each method of the 
class file by performing data-flow analysis on each method. The verifier ensures that at any given point in the 
program, no matter what code path is taken to reach that point:

• The operand stack is always the same size and contains the same types of objects. 

• No local variable is accessed unless it is known to contain a value of an appropriate type.

• Methods are invoked with the appropriate arguments. 

• Fields are assigned only using values of appropriate types. 

• All opcodes have appropriate type arguments on the operand stack and in the local variables.

For further information on this pass, see Section 4.9.2, "The Bytecode Verifier."

Pass 4: For efficiency reasons, certain tests that could in principle be performed in Pass 3 are delayed until 
the first time the code for the method is actually invoked. In so doing, Pass 3 of the verifier avoids loading
class files unless it has to. 

For example, if a method invokes another method that returns an instance of class A, and that instance is only 
assigned to a field of the same type, the verifier does not bother to check if the class A actually exists. 
However, if it is assigned to a field of the type B, the definitions of both A and B must be loaded in to ensure
that A is a subclass of B.

Pass 4 is a virtual pass whose checking is done by the appropriate Java Virtual Machine instructions. The first
time an instruction that references a type is executed, the executing instruction does the following:

• Loads in the definition of the referenced type if it has not already been loaded. 

• Checks that the currently executing type is allowed to reference the type.

• Initializes the class, if this has not already been done.

The first time an instruction invokes a method, or accesses or modifies a field, the executing instruction does
the following:

• Ensures that the referenced method or field exists in the given class.

• Checks that the referenced method or field has the indicated descriptor. 

• Checks that the currently executing method has access to the referenced method or field. 

The Java Virtual Machine does not have to check the type of the object on the operand stack. That check has
already been done by Pass 3. Errors that are detected in Pass 4 cause instances of subclasses of LinkageError
to be thrown.

A Java Virtual Machine is allowed to perform any or all of the Pass 4 steps, except for class or interface
initialization, as part of Pass 3; see 2.16.1, "Virtual Machine Start-up" for an example and more discussion.

In Sun's Java Virtual Machine implementation, after the verification has been performed, the instruction in the
Java Virtual Machine code is replaced with an alternative form of the instruction (see Chapter 9, "An
Optimization"). For example, the opcode new is replaced with new_quick. This alternative instruction indicates
that the verification needed by this instruction has taken place and does not need to be performed again.
Subsequent invocations of the method will thus be faster. It is illegal for these alternative instruction forms to
appear in class files, and they should never be encountered by the verifier.

4.9.2 The Bytecode Verifier 

As indicated earlier, Pass 3 of the verification process is the most complex of the four passes of class file
verification. This section looks at the verification of Java Virtual Machine code in more detail. 

The code for each method is verified independently. First, the bytes that make up the code are broken up into a 
sequence of instructions, and the index into the code array of the start of each instruction is placed in an array.
The verifier then goes through the code a second time and parses the instructions. During this pass a data
structure is built to hold information about each Java Virtual Machine instruction in the method. The operands,
if any, of each instruction are checked to make sure they are valid. For instance:

• Branches must be within the bounds of the code array for the method. 

• The targets of all control-flow instructions are each the start of an instruction. In the case of a wide
instruction, the wide opcode is considered the start of the instruction, and the opcode giving the 
operation modified by that wide instruction is not considered to start an instruction. Branches into the 
middle of an instruction are disallowed. 

• No instruction can access or modify a local variable at an index greater than the number of local 
variables that its method indicates it uses. 

• All references to the constant pool must be to an entry of the appropriate type. For example: the
instruction ldc can only be used for data of type int or float, or for instances of class String; the
instruction getfield must reference a field.

• The code does not end in the middle of an instruction. 

• Execution cannot fall off the end of the code. 

• For each exception handler, the starting and ending point of code protected by the handler must be at
the beginning of an instruction. The starting point must be before the ending point. The exception
handler code must start at a valid instruction, and it may not start at an opcode being modified by the
wide instruction.

For each instruction of the method, the verifier records the contents of the operand stack and the contents of the
local variables prior to the execution of that instruction. For the operand stack, it needs to know the stack
height and the type of each value on it. For each local variable, it needs to know either the type of the contents
of that local variable, or that the local variable contains an unusable or unknown value (it might be
uninitialized). The bytecode verifier does not need to distinguish between the integral types (e.g., byte,
short, char) when determining the value types on the operand stack.

Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables which
represent parameters initially contain values of the types indicated by the method's type descriptor; the operand
stack is empty. All other local variables contain an illegal value. For the other instructions, which have not
been examined yet, no information is available regarding the operand stack or local variables.

Finally, the data-flow analyzer is run. For each instruction, a "changed" bit indicates whether this instruction 
needs to be looked at. Initially, the "changed" bit is only set for the first instruction. The data-flow analyzer 
executes the following loop: 

1. Select a virtual machine instruction whose "changed" bit is set. If no instruction remains whose
"changed" bit is set, the method has successfully been verified. Otherwise, turn off the "changed" bit
of the selected instruction.

2. Model the effect of the instruction on the operand stack and local variables:

o If the instruction uses values from the operand stack, ensure that there are a sufficient number
of values on the stack and that the top values on the stack are of an appropriate type.
Otherwise, verification fails.

o If the instruction uses a local variable, ensure that the specified local variable contains a value
of the appropriate type. Otherwise, verification fails.

o If the instruction pushes values onto the operand stack, ensure that there is sufficient room on
the operand stack for the new values. Add the indicated types to the top of the modeled
operand stack.

o If the instruction modifies a local variable, record that the local variable now contains the new
type.

3. Determine the instructions that can follow the current instruction. Successor instructions can be one of
the following:

o The next instruction, if the current instruction is not an unconditional control transfer
instruction (for instance goto, return or athrow). Verification fails if it is possible to "fall off"
the last instruction of the method.

o The target(s) of a conditional or unconditional branch or switch.

o Any exception handlers for this instruction.

4. Merge the state of the operand stack and local variables at the end of the execution of the current
instruction into each of the successor instructions. In the special case of control transfer to an
exception handler, the operand stack is set to contain a single object of the exception type indicated by
the exception handler information.

o If this is the first time the successor instruction has been visited, record that the operand stack
and local variable values calculated in steps 2 and 3 are the state of the operand stack and
local variables prior to executing the successor instruction. Set the "changed" bit for the
successor instruction.

o If the successor instruction has been seen before, merge the operand stack and local variable
values calculated in steps 2 and 3 into the values already there. Set the "changed" bit if there is
any modification to the values.

5. Continue at step 1.
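The loop in steps 1 through 5 can be sketched with a heavily simplified model. The assumptions are significant: the hypothetical Insn class records only how many words an instruction pops and pushes and which instruction indices may follow it, and the analysis tracks only operand stack depth, not value types or local variables. Because two depths either agree or fail to merge, a successor's "changed" bit is set only on its first visit in this sketch; with real type states a merge could also modify the recorded state and set the bit again.

```java
import java.util.Arrays;

class ChangedBitLoop {
    /** Hypothetical instruction model: words popped, words pushed, possible successors. */
    static final class Insn {
        final int pops;
        final int pushes;
        final int[] successors;
        Insn(int pops, int pushes, int... successors) {
            this.pops = pops;
            this.pushes = pushes;
            this.successors = successors;
        }
    }

    static final int UNVISITED = -1;

    /** Returns the stack depth before each instruction, or throws if verification fails. */
    static int[] verify(Insn[] code, int maxStack) {
        if (code.length == 0) throw new IllegalStateException("empty code array");
        int[] depthBefore = new int[code.length];
        Arrays.fill(depthBefore, UNVISITED);
        boolean[] changed = new boolean[code.length];
        depthBefore[0] = 0; // the operand stack is empty at the first instruction
        changed[0] = true;  // initially only the first instruction is marked
        while (true) {
            // Step 1: select an instruction whose "changed" bit is set.
            int pc = -1;
            for (int i = 0; i < code.length; i++) {
                if (changed[i]) { pc = i; break; }
            }
            if (pc < 0) return depthBefore; // none left: verified
            changed[pc] = false;
            // Step 2: model the instruction's effect on the operand stack.
            Insn insn = code[pc];
            int depth = depthBefore[pc];
            if (depth < insn.pops) throw new IllegalStateException("stack underflow at " + pc);
            depth = depth - insn.pops + insn.pushes;
            if (depth > maxStack) throw new IllegalStateException("stack overflow at " + pc);
            // Steps 3 and 4: merge the resulting state into each successor.
            for (int succ : insn.successors) {
                if (succ < 0 || succ >= code.length) {
                    throw new IllegalStateException("execution leaves the code array at " + pc);
                }
                if (depthBefore[succ] == UNVISITED) {
                    depthBefore[succ] = depth; // first visit: record state, mark changed
                    changed[succ] = true;
                } else if (depthBefore[succ] != depth) {
                    throw new IllegalStateException("stack depths cannot be merged at " + succ);
                }
            }
            // Step 5: continue at step 1.
        }
    }
}
```

A straight-line sequence of two pushes, an add, and a one-word return verifies with max_stack of 2, while a loop that pushes one word per iteration is rejected because the depths arriving at the loop head differ.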

To merge two operand stacks, the number of values on each stack must be identical. The types of values on the
stacks must also be identical, except that differently typed reference values may appear at corresponding
places on the two stacks. In this case, the merged operand stack contains a reference to an instance of the
first common superclass or common superinterface of the two types. Such a reference type always exists
because the type Object is a supertype of all class and interface types. If the operand stacks cannot be merged,
verification of the method fails.

To merge two local variable states, corresponding pairs of local variables are compared. If the two types are
not identical, then unless both contain reference values, the verifier records that the local variable contains an
unusable value. If both of the pair of local variables contain reference values, the merged state contains a
reference to an instance of the first common superclass of the two types.
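The reference-type merge can be approximated with reflection on classes that are already loaded. This is a sketch only: the helper name firstCommonSuperclass is hypothetical, both arguments are assumed to be class types rather than primitives, and common superinterfaces are not searched, so unrelated types simply merge to Object. A real verifier works from class names in the class file rather than from java.lang.Class objects.

```java
class TypeMerge {
    /**
     * Walks up the superclass chain of a and returns the first class
     * that b is assignable to, falling back to Object.
     */
    static Class<?> firstCommonSuperclass(Class<?> a, Class<?> b) {
        for (Class<?> c = a; c != null; c = c.getSuperclass()) {
            if (c.isAssignableFrom(b)) {
                return c; // nearest ancestor of a that is also a supertype of b
            }
        }
        return Object.class; // Object is a supertype of every class and interface type
    }
}
```

For instance, merging Integer with Long yields Number, while merging String with Integer yields Object.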



If the data-flow analyzer runs on a method without reporting a verification failure, then the method has been
successfully verified by Pass 3 of the class file verifier.

Certain instructions and data types complicate the data-flow analyzer. We now examine each of these in more
detail.

4.9.3 Long Integers and Doubles 

Values of the long and double types each take two consecutive words on the operand stack and in the local
variables.

Whenever a long or double is moved into a local variable, the subsequent local variable is marked as
containing the second half of a long or double. This special value indicates that all references to the long or
double must be through the index of the lower-numbered local variable.

Whenever any value is moved to a local variable, the preceding local variable is examined to see if it contains 
the first word of a long or a double. If so, that preceding local variable is changed to indicate that it now 
contains an unusable value. Since half of the long or double has been overwritten, the other half must no 
longer be used. 
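The two bookkeeping rules for long and double local variables can be sketched as follows. The Slot enumeration and the LocalVariableModel class are hypothetical names, and only the two rules described above are modeled: marking the following slot as a second half, and invalidating a preceding first half that gets overwritten.

```java
import java.util.Arrays;

class LocalVariableModel {
    /** Hypothetical verifier-side view of what one local variable slot holds. */
    enum Slot { UNUSABLE, INT, FLOAT, REFERENCE, LONG, DOUBLE, SECOND_HALF }

    final Slot[] slots;

    LocalVariableModel(int maxLocals) {
        slots = new Slot[maxLocals];
        Arrays.fill(slots, Slot.UNUSABLE);
    }

    /** Records a store of the given type into the local variable at index. */
    void store(int index, Slot type) {
        boolean twoWord = type == Slot.LONG || type == Slot.DOUBLE;
        int last = twoWord ? index + 1 : index;
        if (index < 0 || last >= slots.length) {
            throw new IllegalArgumentException("local variable index out of range: " + index);
        }
        // If the preceding slot holds the first word of a long or double, its
        // second word is about to be overwritten, so the first word is unusable.
        if (index > 0 && (slots[index - 1] == Slot.LONG || slots[index - 1] == Slot.DOUBLE)) {
            slots[index - 1] = Slot.UNUSABLE;
        }
        slots[index] = type;
        if (twoWord) {
            // The value must only be referenced through the lower-numbered index.
            slots[index + 1] = Slot.SECOND_HALF;
        }
    }
}
```

Storing a long at index 0 and then an int at index 1 leaves index 0 unusable, which is the overwritten-half case the text describes.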

Dealing with 64-bit quantities on the operand stack is simpler; the verifier treats them as single units on the
stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two
items on the stack are both of type double. When calculating operand stack length, values of type long and
double have length two.

Untyped instructions that manipulate the operand stack must treat values of type double and long as atomic.
For example, the verifier reports a failure if the top value on the stack is a double and it encounters an
instruction such as pop or dup. The instructions pop2 or dup2 must be used instead.

4.9.4 Instance Initialization Methods and Newly Created Objects

Creating a new class instance is a multistep process. The Java statement

new myClass(i, j, k);

can be implemented by the following:

new #1                          // Allocate uninitialized space for myClass
dup                             // Duplicate object on the operand stack
iload_1                         // Push i
iload_2                         // Push j
iload_3                         // Push k
invokespecial myClass.<init>    // Initialize object



This instruction sequence leaves the newly created and initialized object on top of the operand stack. (More
examples of compiling Java code to the instruction set of the Java Virtual Machine are given in Chapter 7,
"Compiling for the Java Virtual Machine.")

The instance initialization method <init> for class myClass sees the new uninitialized object as its this
argument in local variable 0. It must either invoke an alternative instance initialization method for class
myClass or invoke the initialization method of a superclass on the this object before it is allowed to do
anything else with this.

When doing dataflow analysis on instance methods, the verifier initializes local variable 0 to contain an object
of the current class, or, for instance initialization methods, local variable 0 contains a special type indicating an
uninitialized object. After an appropriate initialization method is invoked (from the current class or the current
superclass) on this object, all occurrences of this special type on the verifier's model of the operand stack and
in the local variables are replaced by the current class type. The verifier rejects code that uses the new object
before it has been initialized or that initializes the object twice. In addition, it ensures that every normal return
of the method has either invoked an initialization method in the class of this method or in the direct superclass.

Similarly, a special type is created and pushed on the verifier's model of the operand stack as the result of the
Java Virtual Machine instruction new. The special type indicates the instruction by which the class instance
was created and the type of the uninitialized class instance created. When an initialization method is invoked on
that class instance, all occurrences of the special type are replaced by the intended type of the class instance.
This change in type may propagate to subsequent instructions as the dataflow analysis proceeds.

The instruction number needs to be stored as part of the special type, as there may be multiple
not-yet-initialized instances of a class in existence on the operand stack at one time. For example, the Java
Virtual Machine instruction sequence that implements

    new InputStream(new Foo(), new InputStream("foo"))

may have two uninitialized instances of InputStream on the operand stack at once. When an initialization
method is invoked on a class instance, only those occurrences of the special type on the operand stack or in the
registers that are the same object as the class instance are replaced.

A valid instruction sequence must not have an uninitialized object on the operand stack or in a local variable
during a backwards branch, or in a local variable in code protected by an exception handler or a finally
clause. Otherwise, a devious piece of code might fool the verifier into thinking it had initialized a class instance
when it had, in fact, initialized a class instance created in a previous pass through the loop.
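The bookkeeping described above can be sketched in a few lines of Java. This is only an illustrative model, not the verifier's actual data structures: the class UninitSketch, its nested VType, and the instruction indices 0 and 7 are all hypothetical. The point it shows is that tagging each uninitialized type with the index of its creating new instruction lets the model tell two pending InputStream instances apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; class and method names are hypothetical.
final class UninitSketch {
    // A verifier type: a class name plus the index of the `new` instruction
    // that created the object, or -1 once the object counts as initialized.
    static final class VType {
        final String className;
        final int newInsn;

        VType(String className, int newInsn) {
            this.className = className;
            this.newInsn = newInsn;
        }

        boolean sameUninitializedObject(VType other) {
            return newInsn >= 0
                    && newInsn == other.newInsn
                    && className.equals(other.className);
        }

        @Override
        public String toString() {
            return newInsn < 0 ? className : "uninit(" + className + "@" + newInsn + ")";
        }
    }

    // Models the effect of invoking <init> on `target`: only slots holding the
    // same uninitialized object are rewritten to the initialized class type.
    static List<VType> afterInit(List<VType> slots, VType target) {
        List<VType> out = new ArrayList<>();
        for (VType slot : slots) {
            out.add(slot.sameUninitializedObject(target)
                    ? new VType(slot.className, -1)
                    : slot);
        }
        return out;
    }

    public static void main(String[] args) {
        VType outer = new VType("InputStream", 0); // created by the first `new`
        VType inner = new VType("InputStream", 7); // created by the second `new`
        List<VType> stack = Arrays.asList(outer, outer, inner);
        System.out.println(afterInit(stack, inner));
    }
}
```

Without the instruction index, initializing the inner instance would wrongly mark the outer one as initialized too.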

4.9.5 Exception Handlers

Java Virtual Machine code produced from Sun's Java compiler always generates exception handlers such that:

• The ranges of instructions protected by two different exception handlers always are either completely
disjoint, or else one is a subrange of the other. There is never a partial overlap of ranges.

• The handler for an exception will never be inside the code that is being protected.

• The only entry to an exception handler is through an exception. It is impossible to fall through or
"goto" the exception handler.

These restrictions are not enforced by the class file verifier since they do not pose a threat to the integrity of
the Java Virtual Machine. As long as every nonexceptional path to the exception handler causes there to be a
single object on the operand stack, and as long as all other criteria of the verifier are met, the verifier will pass
the code.

4.9.6 Exceptions and finally

Given the fragment of Java code

    try {
        startFaucet();
        waterLawn();
    } finally {
        stopFaucet();
    }

the Java language guarantees that stopFaucet is invoked (the faucet is turned off) whether we finish watering
the lawn or whether an exception occurs while starting the faucet or watering the lawn. That is, the finally
clause is guaranteed to be executed whether its try clause completes normally, or completes abruptly by
throwing an exception.

To implement the try-finally construct, the Java compiler uses the exception-handling facilities together
with two special instructions, jsr ("jump to subroutine") and ret ("return from subroutine"). The finally
clause is compiled as a subroutine within the Java Virtual Machine code for its method, much like the code for
an exception handler. When a jsr instruction that invokes the subroutine is executed, it pushes its return
address, the address of the instruction after the jsr that is being executed, onto the operand stack as a value of
type returnAddress. The code for the subroutine stores the return address in a local variable. At the end of
the subroutine, a ret instruction fetches the return address from the local variable and transfers control to the
instruction at the return address.

Control can be transferred to the finally clause (the finally subroutine can be invoked) in several different
ways. If the try clause completes normally, the finally subroutine is invoked via a jsr instruction before
evaluating the next Java expression. A break or continue inside the try clause that transfers control outside
the try clause executes a jsr to the code for the finally clause first. If the try clause executes a return, the
compiled code does the following:

1. Saves the return value (if any) in a local variable.

2. Executes a jsr to the code for the finally clause.

3. Upon return from the finally clause, returns the value saved in the local variable.

The compiler sets up a special exception handler which catches any exception thrown by the try clause. If an
exception is thrown in the try clause, this exception handler does the following:

1. Saves the exception in a local variable.

2. Executes a jsr to the finally clause.

3. Upon return from the finally clause, rethrows the exception.

For more information about the implementation of Java's try-finally construct, see Section 7.13,
"Compiling finally."
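The language-level effect of the two three-step sequences above can be observed directly. The following is a minimal sketch with hypothetical names (FinallyDemo, returning, throwing); it demonstrates only the guarantees of the Java language and makes no assumption about which bytecode a particular compiler emits for finally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; shows observable try-finally behaviour only.
final class FinallyDemo {
    static final List<String> log = new ArrayList<>();

    // The return value is saved before the finally clause runs, so the
    // assignment inside finally does not affect what the caller receives.
    static int returning() {
        int x = 1;
        try {
            return x;
        } finally {
            x = 2;
            log.add("finally after return");
        }
    }

    // The exception is held while the finally clause runs, then rethrown.
    static void throwing() {
        try {
            throw new IllegalStateException("boom");
        } finally {
            log.add("finally after throw");
        }
    }

    public static void main(String[] args) {
        System.out.println(returning());
        try {
            throwing();
        } catch (IllegalStateException e) {
            log.add("rethrown: " + e.getMessage());
        }
        System.out.println(log);
    }
}
```

Here returning() yields 1 rather than 2, which matches the "save the return value, run finally, return the saved value" sequence.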

The code for the finally clause presents a special problem to the verifier. Usually, if a particular instruction
can be reached via multiple paths and a particular local variable contains incompatible values through those
multiple paths, then the local variable becomes unusable. However, a finally clause might be called from
several different places, yielding several different circumstances:

• The invocation from the exception handler may have a certain local variable that contains an exception.

• The invocation to implement return may have some local variable that contains the return value.

• The invocation from the bottom of the try clause may have an indeterminate value in that same local
variable.

The code for the finally clause itself might pass verification, but after updating all the successors of the ret
instruction, the verifier would note that the local variable that the exception handler expects to hold an
exception, or that the return code expects to hold a return value, now contains an indeterminate value.

Verifying code that contains a finally clause is complicated. The basic idea is the following:

• Each instruction keeps track of the list of jsr targets needed to reach that instruction. For most code,
this list is empty. For instructions inside code for the finally clause, it is of length one. For multiply
nested finally code (extremely rare!), it may be longer than one.

• For each instruction and each jsr needed to reach that instruction, a bit vector is maintained of all local
variables accessed or modified since the execution of the jsr instruction.

• When executing the ret instruction, which implements a return from a subroutine, there must be only
one possible subroutine from which the instruction can be returning. Two different subroutines cannot
"merge" their execution to a single ret instruction.

• To perform the data-flow analysis on a ret instruction, a special procedure is used. Since the verifier
knows the subroutine from which the instruction must be returning, it can find all the jsr instructions
that call the subroutine and merge the state of the operand stack and local variables at the time of the ret
instruction into the operand stack and local variables of the instructions following the jsr. Merging
uses a special set of values for the local variables:

  • For any local variable for which the bit vector (constructed above) indicates that the subroutine has
  accessed or modified it, use the type of the local variable at the time of the ret.

  • For other local variables, use the type of the local variable before the jsr instruction.
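The merge rule for local variables at a ret can be written down compactly. The sketch below is an assumption-laden illustration: types are modelled as plain strings, and the names RetMergeSketch and mergeAtRet are hypothetical. It takes the types before the jsr, the types at the ret, and the bit vector of variables the subroutine touched.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical model of the local-variable merge performed at a ret instruction.
final class RetMergeSketch {
    // For each local variable: if the subroutine accessed or modified it,
    // take its type at the ret; otherwise keep the type it had before the jsr.
    static String[] mergeAtRet(String[] beforeJsr, String[] atRet, BitSet touched) {
        String[] result = new String[beforeJsr.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = touched.get(i) ? atRet[i] : beforeJsr[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Local 0 holds a saved exception, local 1 an int, local 2 is scratch
        // space that the subroutine uses for its return address.
        String[] beforeJsr = {"Throwable", "int", "unusable"};
        String[] atRet = {"unusable", "int", "returnAddress"};
        BitSet touched = new BitSet();
        touched.set(2);
        System.out.println(Arrays.toString(mergeAtRet(beforeJsr, atRet, touched)));
    }
}
```

Because local 0 is not marked in the bit vector, the caller's saved exception survives the call even though the subroutine's own view of that variable is unusable.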



4.10 Limitations of the Java Virtual Machine and class File
Format

The following limitations in the Java Virtual Machine are imposed by this version of the Java Virtual Machine
specification:

• The per-class constant pool is limited to 65535 entries by the 16-bit constant_pool_count field of
the ClassFile structure (§4.1). This acts as an internal limit on the total complexity of a single class.

• The amount of code per method is limited to 65535 bytes by the sizes of the indices in the
exception_table of the Code attribute (§4.7.4), in the LineNumberTable attribute (§4.7.6), and in
the LocalVariableTable attribute (§4.7.7).

• The number of local variables in a method is limited to 65535 by the two-byte index operand of many
Java Virtual Machine instructions and the size of the max_locals item of the ClassFile structure
(§4.1). (Recall that values of type long and double are considered to occupy two local variables.)

• The number of fields of a class is limited to 65535 by the size of the fields_count item of the
ClassFile structure (§4.1).

• The number of methods of a class is limited to 65535 by the size of the methods_count item of the
ClassFile structure (§4.1).

• The size of an operand stack is limited to 65535 words by the max_stack field of the
Code_attribute structure (§4.7.4).

• The number of dimensions in an array is limited to 255 by the size of the dimensions opcode of the
multianewarray instruction, and by the constraints imposed on the multianewarray, anewarray, and
newarray instructions by §4.8.2.

• A valid Java method descriptor (§4.3.3) must require 255 or fewer words of method arguments,
where that limit includes the word for this in the case of instance method invocations. Note that the
limit is on the number of words of method arguments, and not on the number of arguments themselves.
Arguments of type long and double are two words long; arguments of all other types are one word
long.
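The last limit counts words rather than arguments, which is easy to get wrong. As a rough sketch, the hypothetical helper below (ArgWords.argumentWords is not part of any real API) walks a method descriptor and counts two words for each long (J) or double (D) parameter, one word for everything else including arrays, plus one word for this on instance methods. It assumes a well-formed descriptor and does no validation.

```java
// Hypothetical helper; assumes the descriptor is well formed.
final class ArgWords {
    static final int MAX_ARGUMENT_WORDS = 255;

    static int argumentWords(String descriptor, boolean isInstanceMethod) {
        int words = isInstanceMethod ? 1 : 0; // the word for `this`
        int i = 1;                            // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            char c = descriptor.charAt(i);
            if (c == '[') {
                // An array of any element type is a single reference word.
                while (descriptor.charAt(i) == '[') {
                    i++;
                }
                if (descriptor.charAt(i) == 'L') {
                    i = descriptor.indexOf(';', i);
                }
                i++;
                words += 1;
            } else if (c == 'L') {
                i = descriptor.indexOf(';', i) + 1;
                words += 1;
            } else {
                words += (c == 'J' || c == 'D') ? 2 : 1;
                i++;
            }
        }
        return words;
    }

    static boolean withinLimit(String descriptor, boolean isInstanceMethod) {
        return argumentWords(descriptor, isInstanceMethod) <= MAX_ARGUMENT_WORDS;
    }

    public static void main(String[] args) {
        // int (1) + long (2) + String (1) + double[][] (1) = 5 words
        System.out.println(argumentWords("(IJLjava/lang/String;[[D)V", false));
    }
}
```

Under this counting, a static method can take at most 127 long parameters, since 128 of them would need 256 words.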



* In retrospect, making eight-byte constants take two constant pool entries was a poor choice.

The fact that end_pc is exclusive is an historical mistake in the Java Virtual Machine: if the Java Virtual
Machine code for a method is exactly 65535 bytes long and ends with an instruction that is one byte long, then
that instruction cannot be protected by an exception handler. A compiler writer can work around this bug by
limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization
method, or static initializer (the size of any code array) to 65534 bytes.

The javac compiler in Sun's JDK 1.0.2 release can in fact generate LineNumberTable attributes which are
not in line number order and which are not one-to-one with source lines. This is unfortunate, as we would
prefer to specify a one-to-one, ordered mapping of LineNumberTable attributes to source lines, but must yield
to backward compatibility.
CHAPTER 5

Constant Pool Resolution 



Java classes and interfaces are dynamically loaded (§2.16.2), linked (§2.16.3), and initialized (§2.16.4).
Loading is the process of finding the binary form of a class or interface type with a particular name and
constructing, from that binary form, a Class object to represent the class or interface. Linking is the process of
taking a binary form of a class or interface type and combining it into the runtime state of the Java Virtual
Machine so that it can be executed. Initialization of a class consists of executing its static initializers and the
initializers for static fields declared in the class.

The Java Virtual Machine performs most aspects of these procedures through operations on a constant pool
(§4.4), a per-type runtime data structure that serves many of the purposes of the symbol table of a conventional
language. For example, Java Virtual Machine instructions that might otherwise have been designed to take
immediate numeric or string operands instead fetch their operands from the constant pool. Classes, methods,
and fields, whether referenced from Java Virtual Machine instructions or from other constant pool entries, are
named using the constant pool.

A Java compiler does not presume to know the way in which a Java Virtual Machine lays out classes,
interfaces, class instances, or arrays. References in the constant pool are always initially symbolic. At run
time, the symbolic representation of the reference in the constant pool is used to work out the actual location of
the referenced entity. The process of dynamically determining concrete values from symbolic references in the
constant pool is known as constant pool resolution. Constant pool resolution may involve loading one or more
classes or interfaces, linking several types, and initializing types. There are several kinds of constant pool
entries, and the details of resolution differ with the kind of entry to be resolved.

Individual Java Virtual Machine instructions that reference entities in the constant pool are responsible for
resolving the entities they reference. Constant pool entries that are referenced from other constant pool entries
are resolved when the referring entry is resolved.

A given constant pool entry may be referred to from any number of Java Virtual Machine instructions or other
constant pool entries; thus, constant pool resolution can be attempted on a constant pool entry that is already
resolved. An attempt to resolve a constant pool entry that has already been successfully resolved always
succeeds trivially, and always results in the same entity produced by the initial resolution of that entry.
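The "resolve once, then always return the same entity" behavior can be pictured as a symbolic entry that caches its result. The following sketch is purely illustrative: ResolvableEntry and its resolver callback are hypothetical stand-ins for whatever loading and linking work a real implementation performs.

```java
import java.util.function.Supplier;

// Hypothetical model of a constant pool entry that starts out symbolic
// and is replaced by a concrete value on first resolution.
final class ResolvableEntry<T> {
    private final Supplier<T> resolver; // stand-in for the real resolution work
    private T resolved;
    private boolean done;
    private int resolverRuns;

    ResolvableEntry(Supplier<T> resolver) {
        this.resolver = resolver;
    }

    // Repeated attempts succeed trivially and yield the entity produced
    // by the initial resolution.
    synchronized T resolve() {
        if (!done) {
            resolved = resolver.get();
            resolverRuns++;
            done = true;
        }
        return resolved;
    }

    synchronized int resolverRuns() {
        return resolverRuns;
    }

    public static void main(String[] args) {
        ResolvableEntry<Object> entry = new ResolvableEntry<>(Object::new);
        Object first = entry.resolve();
        Object second = entry.resolve();
        System.out.println((first == second) + " after " + entry.resolverRuns() + " resolver run(s)");
    }
}
```

This sketch caches only successful results; how failed resolutions are remembered is outside what it models.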

Constant pool resolution is normally initiated by the execution of a Java Virtual Machine instruction that
references the constant pool. Rather than give the full description of the resolution process performed by Java
Virtual Machine instructions in their individual descriptions, we will use this chapter to summarize the constant
pool resolution process. We will specify the errors that must be detected when resolving each kind of constant
pool entry, the order in which those errors must be responded to, and the errors thrown in response.

When referenced from the context of certain Java Virtual Machine instructions, additional constraints are put on
linking operations. For instance, the getfield instruction requires not only that the constant pool entry for the
field it references can be successfully resolved, but also that the resolved field is not a class (static) field. If it
is a class field, an exception must be thrown. Linking exceptions that are specific to the execution of a
particular Java Virtual Machine instruction are given in the description of that instruction and are not covered in
this general discussion of constant pool resolution. Note that such exceptions, although described as part of the
execution of Java Virtual Machine instructions rather than constant pool resolution, are still properly
considered failure of the linking phase of Java Virtual Machine execution.

The Java Virtual Machine specification documents and orders all exceptions that can arise as a result of
constant pool resolution. It does not mandate how they should be detected, only that they must be. In addition,
as mentioned in §6.3, any of the virtual machine errors listed as subclasses of VirtualMachineError may be
thrown at any time during constant pool resolution.



5.1 Class and Interface Resolution

A constant pool entry tagged as CONSTANT_Class (§4.4.1) represents a class or interface. Various Java Virtual
Machine instructions reference CONSTANT_Class entries in the constant pool of the class that is current upon
their execution. Several other kinds of constant pool entries (§4.4.2) reference CONSTANT_Class entries
and cause those class or interface references to be resolved when the referencing entries are resolved. For
instance, before a method reference (a CONSTANT_Methodref constant pool entry) can be resolved, the
reference it makes to the class of the method (via the class_index item of the constant pool entry) must first
be resolved.

If a class or interface has not been resolved already, the details of the resolution process depend on what kind
of entity is represented by the CONSTANT_Class entry being resolved. Array classes are handled differently
from non-array classes and from interfaces. Details of the resolution process also depend on whether the
reference prompting the resolution of this class or interface is from a class or interface that was loaded using a
class loader (§2.16.2).

The name_index item of a CONSTANT_Class constant pool entry is a reference to a CONSTANT_Utf8 constant
pool entry (§4.4.7) for a UTF-8 string that represents the fully qualified name (§2.7.9) of the class or interface
to be resolved. What kind of entity is represented by a CONSTANT_Class constant pool entry, and how to
resolve that entry, is determined as follows:

• If the first character of the fully qualified name of the constant pool entry to be resolved is not a left
bracket ("["), then the entry is a reference to a non-array class or to an interface.

  • If the current class (§3.6) has not been loaded by a class loader, then "normal" class resolution is used
  (§5.1.1).

  • If the current class has been loaded by a class loader, then application-defined code is used (§5.1.2) to
  resolve the class.

• If the first character of the fully qualified name of the constant pool entry to be resolved is a left
bracket ("["), then the entry is a reference to an array class. Array classes are resolved specially
(§5.1.3).

5.1.1 Current Class or Interface Not Loaded by a Class Loader

If a class or interface that has been loaded, and that was not loaded using a class loader, references a non-array
class or interface C, then the following steps are performed to resolve the reference to C:

1. The class or interface C and its superclasses are first loaded (§2.16.2).

   a. If class or interface C has not been loaded yet, the Java Virtual Machine will search for a file C.class
      and attempt to load class or interface C from that file. Note that there is no guarantee that the file
      C.class will actually contain the class or interface C or that the file C.class is even a valid class
      file. It is also possible that class or interface C might have already been loaded, but not yet initialized.
      This phase of loading must detect the following errors:

      o If no file with the appropriate name can be found and read, class or interface resolution
        throws a NoClassDefFoundError.
      o Otherwise, if it is determined that the selected file is not a well-formed class file (pass 1 of
        §4.9.1), or is not a class file of a supported major or minor version (§4.1), class or interface
        resolution throws a NoClassDefFoundError.
      o Otherwise, if the selected class file did not actually contain the desired class or interface,
        class or interface resolution throws a NoClassDefFoundError.
      o Otherwise, if the selected class file does not specify a superclass and is not the class file for
        class Object, class or interface resolution throws a ClassFormatError.

   b. If the superclass of the class being loaded has not yet been loaded, it is loaded using this step 1
      recursively. Loading a superclass must detect any of the errors in step 1a, where this superclass is
      considered to be the class being loaded. Note that all interfaces must have java.lang.Object as their
      superclass, which must already have been loaded.

2. If loading class C and its superclasses was successful, the superclass (and thus its superclasses, if
   any) of class C is linked and initialized by applying steps 2-4 recursively.

3. The class C is linked (§2.16.3), that is, it is verified (§4.9) and prepared.

   a. First, the class or interface C is verified to ensure that its binary representation is structurally valid
      (passes 2 and 3 of §4.9.1). Verification may itself cause classes and interfaces to be loaded, but not
      initialized (to avoid circularity), using the procedure in step 1.

      o If the class or interface C contained in class file C.class does not satisfy the static or
        structural constraints on valid class files listed in Section 4.8, "Constraints on Java Virtual
        Machine Code," class or interface resolution throws a VerifyError.

   b. If the class file for class or interface C is successfully verified, the class or interface is prepared.
      Preparation involves creating the static fields for the class or interface and initializing those fields to
      their standard default values (§2.5.1). Preparation should not be confused with the execution of static
      initializers (§2.11); unlike execution of static initializers, preparation does not require the execution of
      any Java code. During preparation:

      o If a class that is not declared abstract has an abstract method, class resolution throws an
        AbstractMethodError.

   c. Certain checks that are specific to individual Java Virtual Machine instructions, but that are logically
      related to this phase of constant pool resolution, are described in the documentation of those
      instructions. For instance, the getfield instruction resolves its field reference, and only afterward
      checks to see whether that field is an instance field (that is, it is not static). Such exceptions are still
      considered and documented to be linking, not runtime, exceptions.

4. Next, the class is initialized. Details of the initialization procedure are given in §2.16.5 and in The
   Java Language Specification.

   o If an initializer completes abruptly by throwing some exception E, and if the class of E is not
     Error or one of its subclasses, then a new instance of the class
     ExceptionInInitializerError, with E as the argument, is created and used in place of E.
   o If the Java Virtual Machine attempts to create a new instance of the class
     ExceptionInInitializerError but is unable to do so because an OutOfMemoryError
     occurs, then the OutOfMemoryError object is thrown instead.

5. Finally, access permissions to the class being resolved are checked:

   o If the current class or interface does not have permission to access the class or interface being
     resolved, class or interface resolution throws an IllegalAccessError. This condition can
     occur, for example, if a class that is originally declared public is changed to be private
     after another class that refers to the class has been compiled.

If none of the preceding errors were detected, constant pool resolution of the class or interface reference must
have completed successfully. However, if an error was detected, one of the following must be true.

• If some exception is thrown in steps 1-4, the class being resolved must have been marked as unusable
or must have been discarded.

• If an exception is thrown in step 5, the class being resolved is still valid and usable.

In either case, the resolution fails, and the class or interface attempting to perform the resolution is prohibited
from accessing the referenced class or interface.
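The rule for initializers that complete abruptly can be observed from ordinary Java code. The sketch below uses hypothetical names (InitFailureDemo, Broken, trigger): a static initializer throws an exception that is not an Error, and the first attempt to use the class surfaces it wrapped in an ExceptionInInitializerError. The result is cached because, once initialization has failed, later uses of the class fail differently.

```java
// Hypothetical demo of the ExceptionInInitializerError wrapping rule.
final class InitFailureDemo {
    static final class Broken {
        // Static field initializer that completes abruptly.
        static int value = compute();

        static int compute() {
            throw new IllegalStateException("initializer failed");
        }
    }

    private static Throwable firstFailure;

    // Touches Broken.value once and remembers what was thrown.
    static synchronized Throwable trigger() {
        if (firstFailure == null) {
            try {
                int unused = Broken.value;
                firstFailure = new AssertionError("initialization unexpectedly succeeded: " + unused);
            } catch (ExceptionInInitializerError e) {
                firstFailure = e;
            }
        }
        return firstFailure;
    }

    public static void main(String[] args) {
        Throwable t = trigger();
        System.out.println(t.getClass().getName() + " caused by " + t.getCause());
    }
}
```

The original IllegalStateException is available through getCause() on the wrapping error.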

5.1.2 Current Class or Interface Loaded by a Class Loader

If a class or interface, loaded using a class loader, references a non-array class or interface C, then that same
class loader is used to load C. The loadClass method of that class loader is invoked on the fully qualified path
name (§2.7.9) of the class to be resolved. The value returned by the loadClass method is the resolved class.
The remainder of the section describes this process in more detail.

Every class loader is an instance of a subclass of the abstract class ClassLoader. Applications implement
subclasses of ClassLoader in order to extend the manner in which the Java Virtual Machine dynamically loads
classes. Class loaders can be used to create classes that originate from sources other than files. For example, a
class could be downloaded across a network, it could be generated on the fly, or it could be decrypted from a
scrambled file.

The Java Virtual Machine invokes the loadClass method of a class loader in order to cause it to load (and
optionally link and initialize) a class. The first argument to loadClass is the fully qualified name of the class to
be loaded. The second argument is a boolean. The value false indicates that the specified class must be
loaded, but not linked or initialized; the value true indicates the class must be loaded, linked, and initialized.

Implementations of class loaders are required to keep track of which classes they have already loaded, linked,
and initialized:

• If a class loader is asked to load (but not link or initialize) a class or interface that it has already loaded
(and possibly already linked and initialized), then it should simply return that class or interface.

• If a class loader is asked to load, link, and initialize a class or interface that it has already loaded but
not yet linked and initialized, the class loader should not reload the class or interface, but should only
link and initialize it.

• If a class loader is asked to load, link, and initialize a class or interface that it has already loaded,
linked, and initialized, the class loader should simply return that class or interface.

When the class loader's loadClass method is invoked with the name of a class or interface that it has not yet
loaded, the class loader must perform one of the following two operations in order to load the class or
interface:

• The class loader can create an array of bytes representing the bytes of a file of class file format; it
then must invoke the method defineClass of class ClassLoader on those bytes to convert them into
a class or interface with this class loader as the class loader for the newly defined class. Invoking
defineClass causes the Java Virtual Machine to perform step 1a of §5.1.1.

  o Invoking defineClass then causes the loadClass method of the class loader to be invoked
  recursively in order to load the superclass of the newly defined class or interface. The fully qualified
  path name of the superclass is derived from the super_class item in the class file format. When the
  superclass is loaded in, the second argument to loadClass is false, indicating that the superclass is
  not to be linked and initialized immediately.

• The class loader can also invoke the static method findSystemClass in class ClassLoader with the
fully qualified name of the class or interface to be loaded. Invoking this method causes the Java Virtual
Machine to perform step 1 of §5.1.1. The resulting class file is not marked as having been loaded by
a class loader.

After the class or interface and its superclasses have been loaded successfully, if the second argument to
loadClass is true the class or interface is linked and initialized. This second argument is always true if the
class loader is being called upon to resolve an entry in the constant pool of a class or interface. The class loader
links and initializes a class or interface by invoking the method resolveClass in the class ClassLoader.
Linking and initializing a class or interface created by a class loader is very similar to linking and initializing a
class or interface without a class loader (steps 2-4 of §5.1.1):

First, the superclass of the class or interface is linked and initialized by calling the loadClass method of the
class loader with the fully qualified name of the superclass as the first argument, and true as the second
argument. Linking and initialization may result in the superclass's own superclass being linked and initialized.
Linking and initialization of a superclass must detect any of the errors of step 3 of §5.1.1.

Next, the bytecode verifier is run on the class or interface being linked and initialized. The verifier may itself
need classes or interfaces to be loaded, and if so, it loads them by invoking the loadClass method of the same
class loader with the second argument being false. Since verification may itself cause classes or interfaces to
be loaded (but not linked or initialized, to avoid circularity), it must detect the errors of step 1 of §5.1.1 for any
classes or interfaces it attempts to load. Running the verifier may also cause the errors of step 3a of §5.1.1.

If the class file is successfully verified, the class or interface is then prepared (step 3b of §5.1.1) and initialized
(step 4 of §5.1.1).

Finally, access permissions to the class or interface are checked (step 5 of §5.1.1). If the current class or
interface does not have permission to access the class being resolved, class resolution throws an
IllegalAccessError exception.

If none of the preceding errors were detected, loading, linking, and initialization of the class or interface must
have completed successfully.
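The bookkeeping requirement on class loaders can be sketched with the java.lang.ClassLoader methods named in this section. The class TrackingLoader and its systemLookups counter are hypothetical; the sketch takes only the findSystemClass route and never calls defineClass, so it illustrates the "do not reload what is already loaded" rule rather than a loader that fetches bytes from elsewhere.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class loader that remembers what it has already loaded.
final class TrackingLoader extends ClassLoader {
    private final Map<String, Class<?>> loaded = new HashMap<>();
    int systemLookups; // how often the underlying lookup actually ran

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class<?> c = loaded.get(name);
        if (c == null) {
            // Not seen before: delegate to the system class lookup.
            c = findSystemClass(name);
            systemLookups++;
            loaded.put(name, c);
        }
        if (resolve) {
            // Link the class; this is a no-op if it is already linked.
            resolveClass(c);
        }
        return c;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        TrackingLoader loader = new TrackingLoader();
        Class<?> first = loader.loadClass("java.lang.String");
        Class<?> second = loader.loadClass("java.lang.String", true);
        System.out.println((first == second) + " with " + loader.systemLookups + " lookup(s)");
    }
}
```

A loader that defines classes from its own byte arrays would call defineClass in the "not seen before" branch instead of findSystemClass.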

5.1.3 Array Classes

A constant pool entry tagged as CONSTANT_Class (§4.4.1) represents an array class if the first character of the
UTF-8 string (§4.4.7) referenced by the name_index item of that constant pool entry is a left bracket ("[").
The number of initial consecutive left brackets in the name represents the number of dimensions of the array
class. Following the one or more initial consecutive left brackets is a field descriptor (§4.3.2) representing
either a primitive type or a non-array reference type; this field descriptor represents the base type of the array
class.

The following steps are performed to resolve an array class referenced from the constant pool of a class or
interface:

1. Determine the number of dimensions of the array class and the field descriptor that represents the base
type of the array class.

2. Determine the base type of the array class:

   • If the field descriptor represents a primitive type (its first character is not "L"), that primitive type is the
   base type of the array class.

   • If the field descriptor represents a non-array reference type (its first character is "L"), that reference
   type is the base type of the array class. The reference type is itself resolved using the procedures
   indicated above in §5.1.1 or in §5.1.2.

3. If an array class representing the same base type and the same number of dimensions has already been
created, the result of the resolution is that array class. Otherwise, a new array class representing the
indicated base type and number of dimensions is created.
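Steps 1 and 2 amount to simple string handling on the class name. A minimal sketch, using hypothetical helper names (ArrayName.dimensions, baseDescriptor, baseIsReference) and assuming the name is a well-formed array class name:

```java
// Hypothetical helpers for splitting an array class name into its parts.
final class ArrayName {
    // Number of leading '[' characters, i.e. the number of dimensions.
    static int dimensions(String name) {
        int d = 0;
        while (d < name.length() && name.charAt(d) == '[') {
            d++;
        }
        return d;
    }

    // The field descriptor that follows the brackets: the base type.
    static String baseDescriptor(String name) {
        return name.substring(dimensions(name));
    }

    // A base type whose descriptor starts with 'L' is a reference type
    // and must itself be resolved; anything else is a primitive type.
    static boolean baseIsReference(String name) {
        return baseDescriptor(name).startsWith("L");
    }

    public static void main(String[] args) {
        String name = "[[Ljava/lang/String;";
        System.out.println(dimensions(name) + " dimensions of " + baseDescriptor(name));
        // The runtime name of int[][] uses the same bracket convention.
        System.out.println(new int[1][1].getClass().getName());
    }
}
```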



5.2 Field and Method Resolution

A constant pool entry tagged as CONSTANT_Fieldref (§4.4.2) represents a class or instance variable or
a (constant) field of an interface (§2.13.4). Note that interfaces do not have instance variables. A constant pool
entry tagged as CONSTANT_Methodref (§4.4.2) represents a method of a class (a static method) or of a class
instance (an instance method). References to interface methods are made using
CONSTANT_InterfaceMethodref constant pool entries; resolution of such entries is described in §5.3.

To resolve a field reference or a method reference, the CONSTANT_Class (§4.4.1) entry representing the class
of which the field or method is a member must first be successfully resolved (§5.1). Thus, any exception that
can be thrown when resolving a CONSTANT_Class constant pool entry can also be thrown as a result of
resolving a CONSTANT_Fieldref or CONSTANT_Methodref entry. If the CONSTANT_Class entry representing
the class or interface can be successfully resolved, exceptions relating to the linking of the method or field itself
can be thrown. When resolving a field reference:

• If the referenced field does not exist in the specified class or interface, field resolution throws a
NoSuchFieldError.

• Otherwise, if the current class does not have permission to access the referenced field, field resolution
throws an IllegalAccessError exception.

If resolving a method:

• If the referenced method does not exist in the specified class or interface, method resolution throws a
NoSuchMethodError.

• Otherwise, if the current class does not have permission to access the method being resolved, method
resolution throws an IllegalAccessError exception.
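The order of the two checks (existence first, then access) can be mimicked with reflection. This is a loose analogy, not how a virtual machine resolves fields: FieldResolutionSketch, Owner, and resolveField are hypothetical, and the access test is deliberately reduced to "private members are visible only from the declaring class", which ignores package and protected access.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical sketch of the error ordering during field resolution.
final class FieldResolutionSketch {
    static class Owner {
        public int visible;
        private int hidden;
    }

    static Field resolveField(Class<?> owner, String name, Class<?> current) {
        Field field;
        try {
            // First check: the field must exist in the specified class.
            field = owner.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new NoSuchFieldError(name);
        }
        // Second check (simplified): private fields are accessible only
        // from the class that declares them.
        if (Modifier.isPrivate(field.getModifiers()) && current != owner) {
            throw new IllegalAccessError(name);
        }
        return field;
    }

    public static void main(String[] args) {
        System.out.println(resolveField(Owner.class, "visible", String.class).getName());
        try {
            resolveField(Owner.class, "hidden", String.class);
        } catch (IllegalAccessError e) {
            System.out.println("IllegalAccessError for " + e.getMessage());
        }
    }
}
```

A missing field is reported as NoSuchFieldError even when the caller would also have lacked access, because the existence check runs first.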



5.3 Interface Method Resolution

A constant pool entry tagged as CONSTANT_InterfaceMethodref (§4.4.2) represents a call to an instance
method declared by an interface. Such a constant pool entry is resolved by converting it into a
machine-dependent internal format. No error or exception is possible except for those documented in §6.3.



5.4 String Resolution 

A constant pool entry tagged as CONSTANT_String (§4.4.3) represents an instance of a string literal (§2.3),
that is, a literal of the built-in type java.lang.String. The Unicode characters (§2.1) of the string literal
represented by the CONSTANT_String entry are found in the CONSTANT_Utf8 (§4.4.7) constant pool entry that
the CONSTANT_String entry references.

The Java language requires that identical string literals (that is, literals that contain the same sequence of
Unicode characters) must reference the same instance of class String. In addition, if the method intern is
called on any string, the result is a reference to the same class instance that would be returned if that string
appeared as a literal. Thus,

("a" + "b" + "c").intern() == "abc"

must have the value true.³
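
A minimal Java sketch of this requirement follows. Because a compiler may fold a concatenation of literals into a single literal, the sketch builds the string at run time with a StringBuilder so that the effect of intern is visible. The class and method names are hypothetical.

```java
// Illustrative sketch: an interned run-time string is identical to the literal.
class InternDemo {
    static boolean internedEqualsLiteral() {
        // Build "abc" at run time so it starts out as a distinct String object.
        String built = new StringBuilder().append("a").append("b").append("c").toString();
        // The freshly built object differs from the literal's instance,
        // but intern() returns the instance shared with the literal.
        return built != "abc" && built.intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(internedEqualsLiteral());
    }
}
```

On an implementation that follows the rule stated above, internedEqualsLiteral returns true.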

To resolve a constant pool entry tagged CONSTANT_String, the Java Virtual Machine examines the series of
Unicode characters represented by the UTF-8 string that the CONSTANT_String entry references.

• If another constant pool entry tagged CONSTANT_String and representing the identical sequence of
Unicode characters has already been resolved, then the result of resolution is a reference to the
instance of class String created for that earlier constant pool entry.

• Otherwise, if the method intern has previously been called on an instance of class String containing
a sequence of Unicode characters identical to that represented by the constant pool entry, then the
result of resolution is a reference to that same instance of class String.

• Otherwise, a new instance of class String is created containing the sequence of Unicode characters
represented by the CONSTANT_String entry; that class instance is the result of resolution.

No error or exception is possible during string resolution except for those documented in §6.3.
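
The three cases above can be modelled with a single table of canonical strings that is shared by string resolution and by intern. The following Java sketch assumes such a table; the class StringResolutionSketch, its HashMap-based table and its method names are hypothetical and do not describe any particular virtual machine.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of CONSTANT_String resolution with one shared table.
class StringResolutionSketch {
    // Hypothetical table mapping a character sequence to its canonical instance.
    private final Map<String, String> canonical = new HashMap<>();

    // Models the intern method: registers s unless an equal string is already canonical.
    String intern(String s) {
        return canonical.computeIfAbsent(s, key -> key);
    }

    // Models resolution of a CONSTANT_String entry whose characters are utf8Chars.
    String resolve(String utf8Chars) {
        // Cases 1 and 2: an earlier resolution or an earlier intern call
        // already produced an instance for this character sequence.
        String existing = canonical.get(utf8Chars);
        if (existing != null) {
            return existing;
        }
        // Case 3: create a new instance and make it the canonical one.
        String created = new String(utf8Chars);
        canonical.put(created, created);
        return created;
    }
}
```

Keeping one table for both operations is what makes a resolved literal and an interned string refer to the same instance.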




5.5 Resolution of Other Constant Pool Items 

Constant pool entries that are tagged CONSTANT_Integer or CONSTANT_Float (§4.4.4), CONSTANT_Long or
CONSTANT_Double (§4.4.5) all have values that are directly represented within the constant pool. Their
resolution cannot throw exceptions except for those documented in §6.3.

Constant pool entries that are tagged CONSTANT_NameAndType (§4.4.6), and CONSTANT_Utf8 (§4.4.7) are
never resolved directly. They are only referenced directly or indirectly by other constant pool entries.
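
As a summary of sections 5.1 through 5.5, the following Java sketch groups constant pool tags by how they are treated. The numeric tag values are those of the class file format; the class ConstantPoolTags, the Handling enum and the method handlingFor are hypothetical names introduced only for this illustration.

```java
// Illustrative classification of constant pool tags by resolution behaviour.
class ConstantPoolTags {
    enum Handling { RESOLVED, DIRECT_VALUE, NEVER_RESOLVED_DIRECTLY }

    static Handling handlingFor(int tag) {
        switch (tag) {
            case 7:   // CONSTANT_Class
            case 8:   // CONSTANT_String
            case 9:   // CONSTANT_Fieldref
            case 10:  // CONSTANT_Methodref
            case 11:  // CONSTANT_InterfaceMethodref
                return Handling.RESOLVED;
            case 3:   // CONSTANT_Integer
            case 4:   // CONSTANT_Float
            case 5:   // CONSTANT_Long
            case 6:   // CONSTANT_Double
                return Handling.DIRECT_VALUE;
            case 1:   // CONSTANT_Utf8
            case 12:  // CONSTANT_NameAndType
                return Handling.NEVER_RESOLVED_DIRECTLY;
            default:
                throw new IllegalArgumentException("unknown constant pool tag " + tag);
        }
    }
}
```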

¹ Sun's JDK release 1.0.2 only verifies class files that have class loaders; it assumes that class files loaded
locally are trusted and do not need verification.

² Future implementations may change the API between the Java Virtual Machine and the class ClassLoader.
Specifically, the Java Virtual Machine rather than the class loader will keep track of which classes and
interfaces have been loaded by a particular class loader. One possibility is that the loadClass method will be
called with a single argument indicating the class or interface to be loaded. The virtual machine will handle the
details of linking and initialization and ensure that the class loader is not invoked with the same class or
interface name multiple times.

³ String literal resolution is not implemented correctly in Sun's JDK release 1.0.2. In that implementation of
the Java Virtual Machine, resolving a CONSTANT_String in the constant pool always allocates a new string.
Two string literals in two different classes, even if they contained the identical sequence of characters, would
never be == to each other. A string literal could never be == to a result of the intern method.

Claims

1. A method of pre-processing class files comprising:

determining a plurality of duplicated elements in a plurality of class files;
forming a shared table comprising said plurality of duplicated elements;
removing said duplicated elements from said plurality of class files to obtain a plurality of reduced class files; and
forming a multi-class file comprising said plurality of reduced class files and said shared table.

2. The method of claim 1, further comprising:

computing an individual memory allocation requirement for each of said plurality of reduced class files;
computing a total memory allocation requirement for said plurality of class files from said individual memory
allocation requirement of each of said plurality of reduced class files; and
storing said total memory allocation requirement in said multi-class file.

3. The method of claim 2, further comprising:

reading said total memory allocation requirement from said multi-class file;
allocating a portion of memory based on said total memory allocation requirement; and
loading said reduced class files and said shared table into said portion of memory.

4. The method of claim 3, further comprising:

accessing said shared table in said portion of memory to obtain one or more elements not found in one or more
of said reduced class files.

5. The method of claim 1, wherein said step of determining a plurality of duplicated elements comprises:

determining one or more constants shared between two or more class files.

6. The method of claim 5, wherein said step of forming a shared table comprises:

forming a shared constant table comprising said one or more constants shared between said two or more class
files.

7. A computer program product comprising:

a computer usable medium having computer readable program code embodied therein for pre-processing
class files, said computer program product comprising:

computer readable program code configured to cause a computer to determine a plurality of duplicated
elements in a plurality of class files;

computer readable program code configured to cause a computer to form a shared table comprising said
plurality of duplicated elements;

computer readable program code configured to cause a computer to remove said duplicated elements from
said plurality of class files to obtain a plurality of reduced class files; and

computer readable program code configured to cause a computer to form a multi-class file comprising said
plurality of reduced class files and said shared table.

8. The computer program product of claim 7, further comprising:

computer readable program code configured to cause a computer to compute an individual memory allocation
requirement of each of said plurality of reduced class files;

computer readable program code configured to cause a computer to compute a total memory allocation
requirement of said plurality of class files from said individual memory allocation requirement of each of said
plurality of reduced class files; and

computer readable program code configured to cause a computer to store said total memory allocation
requirement in said multi-class file.

9. The computer program product of claim 8, further comprising:

computer readable program code configured to cause a computer to read said total memory allocation
requirement from said multi-class file;

computer readable program code configured to cause a computer to allocate a portion of memory based on
said total memory allocation requirement; and

computer readable program code configured to cause a computer to load said reduced class files and said
shared table into said portion of memory.

10. The computer program product of claim 9, further comprising:

computer readable program code configured to cause a computer to access said shared table in said portion
of memory to obtain one or more elements not found in one or more of said reduced class files.

11. The computer program product of claim 7, wherein said computer readable program code configured to cause a
computer to determine said plurality of duplicated elements comprises:

computer readable program code configured to cause a computer to determine one or more constants shared
between two or more class files.

12. The computer program product of claim 11, wherein said computer readable program code configured to cause a
computer to form said shared table comprises:

computer readable program code configured to cause a computer to form a shared constant table comprising
said one or more constants shared between said two or more class files.

13. An apparatus comprising:

a processor;

a memory coupled to said processor;

a plurality of class files stored in said memory;

a process executing on said processor, said process configured to form a multi-class file comprising:

a plurality of reduced class files obtained from said plurality of class files by removing one or more
elements that are duplicated between two or more of said plurality of class files; and
a shared table comprising said duplicated elements.

14. The apparatus of claim 13, wherein said multi-class file further comprises a memory requirement, said memory
requirement being computed by said process.

15. The apparatus of claim 13, wherein said duplicated elements comprise elements of constant pools of respective
class files, said shared table comprising a shared constant pool.

16. The apparatus of claim 13, further comprising:

a virtual machine having a class loader and a runtime data area, said class loader configured to obtain and
load said multi-class file into said runtime data area.

17. The apparatus of claim 16, wherein said class loader is configured to allocate a portion of said runtime data area
based on said memory requirement in said multi-class file.

18. The apparatus of claim 17, wherein said class loader is configured to load said plurality of reduced class files and
said shared table into said portion of said runtime data area.

19. The apparatus of claim 16, wherein said virtual machine is configured to access said shared table when a desired
element associated with a first class file is not present in a corresponding one of said plurality of reduced class files.

20. A memory configured to store data for access by a virtual machine executing in a computer system, comprising:

a data structure stored in said memory, said data structure comprising:

a plurality of reduced class files associated with a plurality of corresponding classes, said plurality of
reduced class files configured to be loaded by the virtual machine for execution of said plurality of classes;

a shared table comprising one or more elements that are duplicated between two or more of said plurality
of classes, said shared table configured to be loaded into the virtual machine to be accessed for said
duplicated elements; and

a memory requirement value configured to be read by a class loader of the virtual machine to allocate a
portion of a runtime data area for loading said plurality of reduced class files and said shared table.

21. The memory of claim 20, wherein said duplicated elements are removed from said plurality of reduced class files.

22. The memory of claim 20, wherein said duplicated elements comprise constants and said shared table comprises
a shared constant pool.

23. The memory of claim 20, wherein said memory requirement value is computed from individual memory
requirements of said plurality of reduced class files and a memory requirement of said shared table.
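
The pre-processing steps recited in the claims can be sketched in Java as follows. This is an illustration under simplifying assumptions, not the claimed implementation: a class file is modelled as a name plus a list of constant strings, the memory requirement of a constant is taken to be its character count, and all names (MultiClassSketch, ClassFile, MultiClassFile, preprocess, lookup) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Illustrative model of forming a multi-class file with a shared constant table.
class MultiClassSketch {
    static final class ClassFile {
        final String name;
        final List<String> constants;
        ClassFile(String name, List<String> constants) {
            this.name = name;
            this.constants = constants;
        }
    }

    static final class MultiClassFile {
        final List<String> sharedTable;
        final List<ClassFile> reduced;
        final int totalMemory;
        MultiClassFile(List<String> sharedTable, List<ClassFile> reduced, int totalMemory) {
            this.sharedTable = sharedTable;
            this.reduced = reduced;
            this.totalMemory = totalMemory;
        }
    }

    static MultiClassFile preprocess(List<ClassFile> input) {
        // Determine duplicated elements: count the class files containing each constant.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (ClassFile cf : input) {
            for (String c : new LinkedHashSet<>(cf.constants)) {
                counts.merge(c, 1, Integer::sum);
            }
        }
        // Form the shared table from constants found in two or more class files.
        List<String> shared = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= 2) {
                shared.add(e.getKey());
            }
        }
        // Remove shared constants from each pool and add up the memory requirements.
        List<ClassFile> reduced = new ArrayList<>();
        int total = sizeOf(shared);
        for (ClassFile cf : input) {
            List<String> kept = new ArrayList<>(cf.constants);
            kept.removeAll(shared);
            reduced.add(new ClassFile(cf.name, kept));
            total += sizeOf(kept);
        }
        return new MultiClassFile(shared, reduced, total);
    }

    // Assumption for illustration: a constant needs one unit of memory per character.
    static int sizeOf(List<String> constants) {
        int n = 0;
        for (String c : constants) {
            n += c.length();
        }
        return n;
    }

    // Looks in the reduced class file first, then falls back to the shared table.
    static boolean lookup(MultiClassFile m, int classIndex, String constant) {
        return m.reduced.get(classIndex).constants.contains(constant)
                || m.sharedTable.contains(constant);
    }
}
```

In this model a loader could read totalMemory once and allocate a single region for the reduced class files and the shared table, rather than sizing each class separately.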
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